Abstract We present a new algorithm to construct a (generalized) deterministic 
Rabin automaton for an LTL formula φ. The automaton is the product of a co-Büchi 
automaton for φ and an array of Rabin automata, one for each G-subformula of φ. 
The Rabin automaton for Gψ is in charge of recognizing whether FGψ holds. This 
information is passed to the co-Büchi automaton that decides on acceptance. As 
opposed to standard procedures based on Safra’s determinization, the states of all 
our automata have a clear logical structure, which allows for various optimizations. 
Experimental results show improvement in the sizes of the resulting automata 
compared to existing methods. 
1 Introduction 


Linear temporal logic (LTL) is the most popular language for the specification of 
properties of single computations of a program. The verification problem for LTL 
consists of deciding if all computations of a program satisfy a given LTL-formula 
formalizing a property. In the automata-theoretic approach to this problem [VW86, 
VW94, Var99], the negation of the formula is translated into an ω-automaton, 
and the product of this automaton with the transition system describing the 
semantics of the program is analyzed. In particular, if this transition system—or 
some suitable abstraction of it—has a finite number of states, then the product can 
be exhaustively explored by a search algorithm, and the property can be checked 
automatically, at least in principle. 

While the size of the ω-automaton can be exponential or even double exponential 
in the length of the formula (depending on the kind of ω-automaton), typical 
formulae used in practice are either small, or belong to classes for which this 
blowup does not happen. However, since the transition system is often very large, 
generating small ω-automata is still crucial for the efficiency of the approach: even 
a reduction of a few states in the ω-automaton can lead to a much larger reduction 
in the product. 

For functional LTL verification (as opposed to the probabilistic verification 
discussed in the next paragraph), verification algorithms only require transforming 
the LTL formula into a non-deterministic ω-automaton, typically a Büchi or 
generalized Büchi automaton. Thanks to intense research in the last decade, 
the problem of generating small automata is well understood, e.g. [GPVW95, 
Cou99, GO01]. Several tools implement a number of heuristic simplifications (of 
the formula, of intermediate automata generated during the translation, and of 
the final result), and generate Büchi automata of minimal or nearly minimal size 
for most common specifications, e.g. [BKRS12,DL13]. An important factor for this 
success is the fact that the states of the automaton are LTL formulae, which allows 
one to use information about logical equivalence or implication between formulae 
to merge states. 

The picture is still very different for quantitative LTL verification of probabilistic 
systems, i.e., for the problem of computing the probability with which an LTL 
property is satisfied, or deciding whether it exceeds a given bound. The standard 
approach to this problem requires translating the LTL formula into a deterministic 
ω-automaton [BK08,CGK13], typically a deterministic Rabin automaton (DRA). 
Contrary to the functional case, up to 2012 there were no algorithms providing 
a direct translation; all available algorithms proceeded in two steps: first, the 
formula was translated into a non-deterministic Büchi automaton (NBA), and then 
Safra's construction [Saf88], or an improvement on it [Pit06,Sch09], was applied 
to transform the NBA into a DRA. At the time of writing this paper this is also 
the default approach adopted in PRISM [KNP11], a leading probabilistic model 
checker, which reimplements the optimized Safra's construction of the ltl2dstar 
tool [Kle05]. While Safra’s construction is a milestone of the theory of ω-automata, 
it is also difficult to implement (see e.g. [Kup12]). In particular, it is a monolithic 
construction that can be applied to any NBA, and therefore does not exploit the 
structure of LTL formulae. 

In 2011 the second author initiated a research program for the design and 
implementation of a direct translation of LTL into deterministic ω-automata that 
“bypasses” Safra’s construction. As a first result, a translation for the LTL fragment 
containing only the temporal operators F and G was presented in [KE12]. The 
translation yields a deterministic generalized Rabin automaton (DGRA), which can 
then be degeneralized into a standard DRA. Alternatively, a verification algorithm 
was proposed in [CGK13] that does not require degeneralizing the DGRA into a 
DRA, but works directly on the DGRA and exhibits the same worst-case complexity. In 
both cases much smaller automata were obtained for many formulae. (For instance, 
while the standard approach translates a conjunction of three fairness constraints 
into an automaton with over a million states, the algorithm of [KE12] yields a DRA 
with 462 states and, when acceptance is defined on transitions, a DGRA with 
one single state.) Subsequently, the approach was extended to larger fragments of 
LTL containing the X-operator and restricted appearances of U [GKE12, KLG13]. 
However, a general algorithm remained elusive. 

In this paper we present a novel approach that is able to handle full LTL. 
Although the worst-case complexity of our construction is worse than that of the 
traditional translation using Safra’s determinization (triple exponential vs. double 
exponential), our construction consistently produces smaller automata in all our 
benchmark sets. Moreover, our approach is compositional: the DGRA is obtained 
as a parallel composition of automata running in lockstep¹. More specifically, the 
automaton for a formula φ is the parallel composition of a co-Büchi automaton 
(a special case of DRA) and an array of DRAs, one for each G-subformula of 
φ. Intuitively, the state of the co-Büchi automaton after reading a finite word 
corresponds to “the formula that remains to be fulfilled” (we say that the automaton 
monitors the remaining formula). For example, if φ = (¬a ∧ Xa) ∨ XXGa, then 
the remaining formula after reading ∅{a} is tt, and after reading {a} it is XGa. 
In particular, if the automaton reaches the state tt, it accepts. 

If the co-Büchi automaton never reaches tt, then it needs information from the 
DRAs to decide on acceptance. The DRA for a G-subformula Gψ checks whether 
Gψ eventually holds, i.e., whether FGψ holds. Like the co-Büchi automaton, the 
DRA also monitors the remaining formula, but only partially: more precisely, it 
does not monitor any G-subformula of ψ, because other DRAs are responsible for 
them. For instance, if ψ = a ∧ Gb ∧ Gc, then the DRA for Gψ checks FGa, and 
"delegates" checking FGb and FGc to other automata. Furthermore, and crucially, 
the DRA for Gψ may also provide the information that not only FGψ, but a 
stronger formula FG(ψ ∧ ψ′) holds. For example, the run of the DRA for G(a ∨ Xc) 
on the word c^ω supplies the information that not only FG(a ∨ Xc), but also the 
stronger formula FG((a ∨ Xc) ∧ c) holds. 

The acceptance condition of the full parallel composition is a disjunction over 
all possible subsets 𝒢 of G-subformulae, and all possible sets of stronger formulae 
ℱ that the DRAs can check together. Intuitively, the parallel composition accepts 
a word w by means of the disjunct for 𝒢 and ℱ when w satisfies FG𝒢 (meaning 
that w satisfies FGψ for every Gψ ∈ 𝒢) and also ℱ. The co-Büchi automaton is 
in charge of checking the conditional property that if w satisfies FG𝒢 and ℱ, then it 
also satisfies φ. 

A previous version of our compositional algorithm appeared in [EK14]. Since 
the construction was involved and had a number of corner cases, the third author 
mechanically verified it in the Isabelle theorem prover. The exercise revealed that, 
as expected, some minor corrections were necessary, but also exposed a more 
serious bug requiring a substantial change in a lemma. An analysis revealed that 
the smallest formula known to us for which the construction of [EK14] would have 
produced a wrong result is G(Xa ∨ GXb), so the bug had a high chance of surviving a 
large amount of testing. 

To summarize, in contrast to the traditional approaches our novel translation 
(1) is efficient in practice, (2) is compositional, (3) preserves the logical structure of 
states, and (4) is proven correct in a theorem prover. 


Related work There are many constructions translating LTL to NBA, e.g., [GPVW95, 
Cou99, DGV99, EH00, SB00, GO01, GL02, Fri03, BKRS12, DL13]. The one recommended 
by ltl2dstar and used in PRISM is LTL2BA [GO01]. The version of Safra’s 
construction described in [KB07], which includes a number of optimizations, has 
been implemented in ltl2dstar [Kle05], and re-implemented in PRISM [KNP11]. 


1 We could also speak of a product of automata, but the operational view behind the term 
parallel composition helps to convey the intuition. 


A comparison of LTL translators into deterministic ω-automata can be found in 
[BKS13]. 

Our compositional construction shares the idea of recursive use of automata 
with the construction of [PZ08], where transducers for subformulae, called temporal 
testers, are composed. However, “testers are inherently non-deterministic" [PZ08], 
whereas all our automata are deterministic. 


Apart from LTL verification of probabilistic systems, Safra’s construction can 
also be applied as an intermediate step to solve other problems, such as the LTL 
synthesis problem [PR88]. Bypassing Safra’s construction by means of “safraless 
approaches” to synthesis has been the subject of several papers [KV05,KPV06, 
GGRS10]. 


Outline The paper is organized as follows: After Section 2, which introduces basic 
definitions about LTL and ω-automata, the next four sections present LTL-to-DGRA 
constructions for increasingly general LTL fragments. As a warm-up, Section 
3 considers the case of G-free formulae. Section 4 considers the case of formulae 
FGφ, where φ has no occurrence of G. Loosely speaking, it gives the recipe to 
construct a single element of the array of DRAs. Section 5 then constructs a 
DGRA for an arbitrary formula FGφ as an array of DRAs. Section 6 shows how 
to construct the co-Büchi automaton and the full parallel composition for an 
arbitrary formula. All four sections have the same structure. First, we obtain a 
logical characterization of the words that satisfy a formula of the corresponding 
fragment, and then derive the corresponding automaton from it. 

The paper continues with Section 7, which describes some optimizations that 
reduce the number of states of the final DGRA, and the size of its acceptance 
condition. Section 8 contains some remarks about the worst-case complexity of our 
construction. Finally, Section 9 introduces Rabinizer, the tool implementing our 
construction, and presents a number of experimental results on different test suites 
of LTL formulae. 

As mentioned above, the correctness proof of our construction has been mechanized 
using the Isabelle theorem prover. Section 10 shows how to access the 
mechanized proofs, and the relation between this paper and the formal proof. In 
particular, in the paper we sometimes omit cases in proofs by structural induction 
that do not provide special insight. 

Finally, Section 11 presents our conclusions. Some technical proofs are presented 
in the Appendix. 


2 Basic Definitions 


We recall basic definitions of ω-automata and linear temporal logic, and establish 
some notations. 

In this paper, N denotes the set of natural numbers including zero. We say that 
a property holds for almost every n ∈ N if it holds for all but finitely many natural 
numbers. 


2.1 Alphabets and words 


An alphabet is any finite non-empty set Σ. The elements of Σ are called letters. 
A word is an infinite sequence of elements of Σ. The set of all words is denoted 
by Σ^ω. A finite word is a finite sequence of elements of Σ, and the set of all finite 
words is denoted by Σ*. 

The ith letter of a word w ∈ Σ^ω is denoted by w[i], i.e. w = w[0]w[1]⋯. Given 
i, j ∈ N, we denote by w_{ij} the finite word w[i]w[i + 1]⋯w[j − 1] if i < j, and the 
empty word if j ≤ i. We denote by w_i the suffix w[i]w[i + 1]⋯. 

A (finite or infinite) set of words is called a language. 


2.2 Linear Temporal Logic 


Linear temporal logic (LTL) extends propositional logic with temporal operators. 


2.2.1 Syntax and semantics 


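Definition 1 (LTL Syntax) Let Ap be a finite set of atomic propositions. The 
formulae of linear temporal logic (LTL) over Ap are given by the syntax 


\[ \varphi ::= \mathbf{tt} \mid \mathbf{ff} \mid a \mid \neg\varphi \mid \varphi \wedge \varphi \mid \varphi \vee \varphi \mid \mathbf{X}\varphi \mid \mathbf{F}\varphi \mid \mathbf{G}\varphi \mid \varphi\,\mathbf{U}\,\varphi \]
where a ∈ Ap. 


Formulae are interpreted on words over the alphabet 2^Ap. That is, a letter is a 
subset of Ap. 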


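Definition 2 (LTL Semantics) The satisfaction relation ⊨ between words and 
formulae is inductively defined as follows: 


\[
\begin{array}{lcl}
w \models \mathbf{tt} & & \\
w \not\models \mathbf{ff} & & \\
w \models a & \text{iff} & a \in w[0] \\
w \models \neg\varphi & \text{iff} & w \not\models \varphi \\
w \models \varphi \wedge \psi & \text{iff} & w \models \varphi \text{ and } w \models \psi \\
w \models \varphi \vee \psi & \text{iff} & w \models \varphi \text{ or } w \models \psi \\
w \models \mathbf{X}\varphi & \text{iff} & w_1 \models \varphi \\
w \models \mathbf{F}\varphi & \text{iff} & \exists k \in \mathbb{N} : w_k \models \varphi \\
w \models \mathbf{G}\varphi & \text{iff} & \forall k \in \mathbb{N} : w_k \models \varphi \\
w \models \varphi\,\mathbf{U}\,\psi & \text{iff} & \exists k \in \mathbb{N} : w_k \models \psi \text{ and } \forall\, 0 \le j < k : w_j \models \varphi
\end{array}
\]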


Given two formulae φ, ψ, we say that φ entails ψ, denoted by φ ⊨ ψ, if w ⊨ φ 
implies w ⊨ ψ for every w ∈ (2^Ap)^ω. We say that φ and ψ are equivalent, denoted 
by φ ≡ ψ, if φ ⊨ ψ and ψ ⊨ φ. 


2.2.2 Negation normal-form 


In LTL negations can be “pushed inwards”; for instance, we have ¬FGa ≡ G¬Ga ≡ 
GF¬a. By pushing negations inwards until all negations appear only in front of 
atomic propositions, we obtain the negation normal form: 


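Definition 3 (Negation normal form) A formula of LTL is in negation normal 
form if it is given by the syntax: 


\[ \varphi ::= \mathbf{tt} \mid \mathbf{ff} \mid a \mid \neg a \mid \varphi \wedge \varphi \mid \varphi \vee \varphi \mid \mathbf{X}\varphi \mid \mathbf{F}\varphi \mid \mathbf{G}\varphi \mid \varphi\,\mathbf{U}\,\varphi \]
where a ∈ Ap. 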


Proposition 1 (Normal form theorem) Every formula of LTL is equivalent 
to a formula in negation normal form. 


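Proof Exhaustive application of the following well-known rewrite rules (which 
replace a formula by an equivalent one) brings every formula into negation normal 
form: 


\[ \neg\mathbf{X}\varphi \leadsto \mathbf{X}\neg\varphi, \quad \neg\mathbf{F}\varphi \leadsto \mathbf{G}\neg\varphi, \quad \neg\mathbf{G}\varphi \leadsto \mathbf{F}\neg\varphi, \quad \neg(\varphi\,\mathbf{U}\,\psi) \leadsto (\neg\psi\,\mathbf{U}\,(\neg\psi \wedge \neg\varphi)) \vee \mathbf{G}\neg\psi. \]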


Observe that, due to the last rule, the formula obtained by exhaustive rewriting 
can be exponentially longer than the original formula. However, if the formula 
is stored as a DAG (directed acyclic graph) instead of a tree, then the DAG of 
the formula in negation normal form is only linearly larger than the DAG of the 
original formula. 
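
The rewriting in the proof can be sketched as a short recursive function. This is only an illustration under assumptions of ours: formulae are encoded as nested tuples (an encoding chosen for this sketch, not taken from the paper or from any tool), and the propositional rules (double negation, De Morgan, negated constants) are the standard ones, which the proof above leaves implicit.

```python
# Hypothetical tuple encoding, used only in this sketch:
# ("tt",), ("ff",), ("ap", "a"), ("not", f), ("and", f, g), ("or", f, g),
# ("X", f), ("F", f), ("G", f), ("U", f, g)

def nnf(f):
    """Push negations inwards until they only guard atomic propositions."""
    op = f[0]
    if op in ("tt", "ff", "ap"):
        return f
    if op != "not":
        # Non-negated operator: normalize the children.
        return (op,) + tuple(nnf(g) for g in f[1:])
    g = f[1]                       # f is the negation of g
    gop = g[0]
    if gop == "tt":
        return ("ff",)
    if gop == "ff":
        return ("tt",)
    if gop == "ap":
        return f                   # negated atomic proposition: already normal
    if gop == "not":
        return nnf(g[1])           # double negation
    if gop == "and":               # De Morgan
        return ("or", nnf(("not", g[1])), nnf(("not", g[2])))
    if gop == "or":                # De Morgan
        return ("and", nnf(("not", g[1])), nnf(("not", g[2])))
    if gop == "X":
        return ("X", nnf(("not", g[1])))
    if gop == "F":
        return ("G", nnf(("not", g[1])))
    if gop == "G":
        return ("F", nnf(("not", g[1])))
    if gop == "U":
        # not(p U q)  ~>  (not q U (not q and not p)) or G not q
        n_p, n_q = nnf(("not", g[1])), nnf(("not", g[2]))
        return ("or", ("U", n_q, ("and", n_q, n_p)), ("G", n_q))
    raise ValueError(f"unknown operator {gop!r}")
```

Because the tuple for ¬ψ is built once and reused in the last rule, the result shares subterms, which mirrors the DAG argument above: the tree may grow exponentially, the DAG does not.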

In the rest of the paper we assume that formulae of LTL are in negation normal 
form, and speak of “a formula" instead of “a formula in negation normal form". 


2.2.3 Propositional entailment, equivalence, and substitution 


Loosely speaking, given two formulae φ and ψ, we say that φ propositionally entails 
ψ if φ ⊨ ψ can be proved using only propositional reasoning. So, for instance, Ga 
propositionally implies Ga ∨ Gb, but Ga does not propositionally imply Fa. 


Definition 4 (Propositional implication and equivalence) A formula of 
LTL is proper if it is not a conjunction or a disjunction (i.e., if the root of its 
syntax tree is not ∧ or ∨). The set of proper formulae of LTL over Ap is denoted 
by PF(Ap). A propositional assignment, or just an assignment, is a mapping 
A : PF(Ap) → {0, 1}. Given φ ∈ PF(Ap), we write A ⊨_P φ iff A(φ) = 1, and 
extend the relation ⊨_P to arbitrary formulae by: 


A ⊨_P φ ∧ ψ iff A ⊨_P φ and A ⊨_P ψ 
A ⊨_P φ ∨ ψ iff A ⊨_P φ or A ⊨_P ψ 


We say that φ propositionally entails ψ, denoted by φ ⊨_P ψ, if A ⊨_P φ implies 
A ⊨_P ψ for every assignment A. Finally, φ and ψ are propositionally equivalent, 
denoted by φ ≡_P ψ, if φ ⊨_P ψ and ψ ⊨_P φ. We denote by [φ]_P the equivalence 
class of φ under the equivalence relation ≡_P. (Observe that φ ≡_P ψ implies that 
φ ≡ ψ holds.) 
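
Definition 4 can be checked by brute force: enumerate all assignments to the proper formulae occurring in φ and ψ. The sketch below is ours and makes two assumptions that are not in the text: formulae are nested tuples such as ("G", ("ap", "a")), and the constants tt and ff are evaluated as true and false rather than as free variables.

```python
from itertools import product

def proper_parts(f):
    """Proper formulae at the leaves of the and/or structure of f (constants excluded)."""
    if f[0] in ("tt", "ff"):
        return set()
    if f[0] in ("and", "or"):
        return proper_parts(f[1]) | proper_parts(f[2])
    return {f}

def holds(assignment, f):
    """The relation A |=_P f, with tt/ff treated as constants (assumption of this sketch)."""
    if f[0] == "tt":
        return True
    if f[0] == "ff":
        return False
    if f[0] == "and":
        return holds(assignment, f[1]) and holds(assignment, f[2])
    if f[0] == "or":
        return holds(assignment, f[1]) or holds(assignment, f[2])
    return assignment[f]

def prop_entails(f, g):
    """f |=_P g: every assignment satisfying f also satisfies g."""
    atoms = list(proper_parts(f) | proper_parts(g))
    for values in product((False, True), repeat=len(atoms)):
        assignment = dict(zip(atoms, values))
        if holds(assignment, f) and not holds(assignment, g):
            return False
    return True

def prop_equivalent(f, g):
    return prop_entails(f, g) and prop_entails(g, f)
```

With this encoding, Ga propositionally entails Ga ∨ Gb, while Ga does not propositionally entail Fa, because Ga and Fa are unrelated propositional variables.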


Definition 5 (Propositional substitution) Let ψ, χ be formulae, and let Ψ 
be a set of proper LTL-formulae. The formula ψ[Ψ/χ]_P is inductively defined as 
follows: 


- If ψ = ψ_1 ∧ ψ_2 then ψ[Ψ/χ]_P = ψ_1[Ψ/χ]_P ∧ ψ_2[Ψ/χ]_P. 
- If ψ = ψ_1 ∨ ψ_2 then ψ[Ψ/χ]_P = ψ_1[Ψ/χ]_P ∨ ψ_2[Ψ/χ]_P. 
- If ψ is a proper formula and ψ ∈ Ψ then ψ[Ψ/χ]_P = χ, else ψ[Ψ/χ]_P = ψ. 
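
A minimal sketch of Definition 5, again over a nested-tuple encoding of formulae that is our own assumption. The point it illustrates is that substitution only descends through ∧ and ∨, so occurrences hidden below a temporal operator are left untouched.

```python
def substitute(psi, Psi, chi):
    """Propositional substitution: replace the proper formulae in Psi by chi."""
    if psi[0] in ("and", "or"):
        return (psi[0], substitute(psi[1], Psi, chi), substitute(psi[2], Psi, chi))
    # psi is proper: replace it as a whole, never look inside it.
    return chi if psi in Psi else psi
```

For example, substituting tt for Ga in Ga ∧ X Ga only rewrites the first conjunct.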


2.2.4 The After Function af(φ, w) 


Given a formula φ and a finite word w, we define a formula af(φ, w), read “φ after 
w”. Intuitively, if a word ww′ (where w is a finite word) satisfies φ, then af(φ, w) 
is the formula that holds “after having read w”, that is, the formula satisfied by w′. 
As shown in Proposition 2 below, the converse also holds: if w′ satisfies af(φ, w), 
then ww′ satisfies φ. 


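Definition 6 Let φ be a formula and ν ∈ 2^Ap. We define the formula af(φ, ν) as 
follows: 


\[
\begin{array}{lcl}
af(\mathbf{tt}, \nu) & = & \mathbf{tt} \\
af(\mathbf{ff}, \nu) & = & \mathbf{ff} \\
af(a, \nu) & = & \begin{cases} \mathbf{tt} & \text{if } a \in \nu \\ \mathbf{ff} & \text{if } a \notin \nu \end{cases} \\
af(\neg a, \nu) & = & \begin{cases} \mathbf{ff} & \text{if } a \in \nu \\ \mathbf{tt} & \text{if } a \notin \nu \end{cases} \\
af(\varphi \wedge \psi, \nu) & = & af(\varphi, \nu) \wedge af(\psi, \nu) \\
af(\varphi \vee \psi, \nu) & = & af(\varphi, \nu) \vee af(\psi, \nu) \\
af(\mathbf{X}\varphi, \nu) & = & \varphi \\
af(\mathbf{G}\varphi, \nu) & = & af(\varphi, \nu) \wedge \mathbf{G}\varphi \\
af(\mathbf{F}\varphi, \nu) & = & af(\varphi, \nu) \vee \mathbf{F}\varphi \\
af(\varphi\,\mathbf{U}\,\psi, \nu) & = & af(\psi, \nu) \vee (af(\varphi, \nu) \wedge \varphi\,\mathbf{U}\,\psi)
\end{array}
\]


We extend the definition to finite words: af(φ, ε) = φ, and af(φ, νw) = af(af(φ, ν), w) 
for every ν ∈ 2^Ap and every finite word w. Finally, we say that ψ is reachable from 
φ if ψ = af(φ, w) for some finite word w. 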


Example 1 Let Ap = {a, b, c} and φ = a ∨ (b U c). We have af(φ, {a}) = tt, 
af(φ, {b}) = (b U c), af(φ, {c}) = tt, and af(φ, ∅) = ff. 
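
Definition 6 translates almost line by line into code. The following sketch is ours, not the implementation described later in the paper. It assumes a nested-tuple encoding of formulae in negation normal form, represents letters as Python sets, and simplifies the constants tt and ff on the fly (a simplification that preserves propositional equivalence), so that the results can be compared syntactically with Example 1.

```python
TT, FF = ("tt",), ("ff",)

def conj(f, g):
    """f and g, with tt/ff simplified away."""
    if FF in (f, g):
        return FF
    if f == TT:
        return g
    if g == TT:
        return f
    return ("and", f, g)

def disj(f, g):
    """f or g, with tt/ff simplified away."""
    if TT in (f, g):
        return TT
    if f == FF:
        return g
    if g == FF:
        return f
    return ("or", f, g)

def af(f, letter):
    """One-step 'after' function; `letter` is a set of atomic propositions."""
    op = f[0]
    if op in ("tt", "ff"):
        return f
    if op == "ap":
        return TT if f[1] in letter else FF
    if op == "not":                      # negation normal form: f[1] is an atom
        return FF if f[1][1] in letter else TT
    if op == "and":
        return conj(af(f[1], letter), af(f[2], letter))
    if op == "or":
        return disj(af(f[1], letter), af(f[2], letter))
    if op == "X":
        return f[1]
    if op == "G":
        return conj(af(f[1], letter), f)
    if op == "F":
        return disj(af(f[1], letter), f)
    if op == "U":
        return disj(af(f[2], letter), conj(af(f[1], letter), f))
    raise ValueError(f"unknown operator {op!r}")

def af_word(f, word):
    """Extension of af to finite words (sequences of letters)."""
    for letter in word:
        f = af(f, letter)
    return f
```

Applied to φ = a ∨ (b U c), the function reproduces the four values of Example 1. Applied to (¬a ∧ Xa) ∨ XXGa, it yields tt after ∅{a} and XGa after {a}.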


We collect a number of simple properties of af, proved in the Appendix. 


Lemma 1 For every formula φ and every finite word w ∈ (2^Ap)*: 


(1) af(φ, w) is a boolean combination of proper subformulae of φ. 

(2) If af(φ, w) = tt, then af(φ, ww′) = tt for every w′ ∈ (2^Ap)*, and analogously 
for ff. 

(3) If φ_1 ≡_P φ_2, then af(φ_1, w) ≡_P af(φ_2, w). 

(4) If φ has n proper subformulae, then the set of formulae reachable from φ 
has at most 2^(2^n) equivalence classes of formulae with respect to propositional 
equivalence. 


Observe that, by Lemma 1(3), the function af can be lifted to equivalence 
classes of formulae w.r.t. propositional equivalence. Abusing language, we also 
denote this lifted function by af. 

We now state the fundamental property of the After function, also proved in 
the Appendix: a word ww′ satisfies a formula φ iff “after reading" w the “rest” of 
the word, i.e., the word w′, satisfies af(φ, w). 


Proposition 2 Let φ be a formula, and let ww′ ∈ (2^Ap)^ω be an arbitrary word. 
Then ww′ ⊨ φ iff w′ ⊨ af(φ, w). 
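
As a worked instance of Proposition 2, take the formula φ = (¬a ∧ Xa) ∨ XXGa from the introduction and the finite word ∅{a}. The derivation below only unfolds Definition 6 and simplifies the constants tt and ff propositionally:

```latex
\begin{align*}
af(\varphi, \emptyset)
  &= (af(\neg a, \emptyset) \wedge af(\mathbf{X}a, \emptyset)) \vee af(\mathbf{XXG}a, \emptyset) \\
  &= (\mathbf{tt} \wedge a) \vee \mathbf{XG}a \;\equiv_P\; a \vee \mathbf{XG}a, \\
af(a \vee \mathbf{XG}a, \{a\})
  &= af(a, \{a\}) \vee af(\mathbf{XG}a, \{a\}) \\
  &= \mathbf{tt} \vee \mathbf{G}a \;\equiv_P\; \mathbf{tt}.
\end{align*}
```

By Lemma 1(3) this gives af(φ, ∅{a}) ≡_P tt, and since every word satisfies tt, Proposition 2 says that every word of the form ∅{a}w′ satisfies φ.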


2.3 Transition systems and ω-automata 


A deterministic transition system (DTS) over an alphabet Σ is a tuple 𝒯 = 
(Q, Σ, δ, q_0) where Q is a set of states, Σ is an alphabet, δ : Q × Σ → Q is a 
transition function, and q_0 ∈ Q is the initial state. If δ(q, a) = q′ then we call the 
triple t = (q, a, q′) a transition, and say that q, a, and q′ are the source, the letter, 
and the target of t. We denote by T the set of transitions of 𝒯. 

A run of 𝒯 is an infinite sequence ρ = t_0 t_1 ⋯ of transitions such that the 
source of t_0 is the initial state q_0, and for every i ≥ 0 the target of t_i is equal to 
the source of t_{i+1}. A transition t occurs in ρ if t = t_i for some i ≥ 0. A state q 
occurs in ρ if it is the source or target of some t_i. Given a word w = a_0 a_1 ⋯ ∈ Σ^ω, 
we denote by ρ(w) the unique run t_0 t_1 t_2 ⋯ of 𝒯 such that for every i ≥ 0 the 
letter of t_i is a_i. 

The product of two DTSs 𝒯_1 = (Q_1, Σ, δ_1, q_01) and 𝒯_2 = (Q_2, Σ, δ_2, q_02) is the 
DTS 𝒯_1 × 𝒯_2 = (Q, Σ, δ, q_0), where Q = Q_1 × Q_2, δ((q_1, q_2), a) = (δ_1(q_1, a), δ_2(q_2, a)) 
for every q_1 ∈ Q_1, q_2 ∈ Q_2, a ∈ Σ, and q_0 = (q_01, q_02). 
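
The product construction can be sketched as follows. The sketch assumes that a transition function is a Python dict mapping (state, letter) pairs to states, and, unlike the definition above, it builds only the part of Q_1 × Q_2 reachable from the initial state. Both choices are ours.

```python
from collections import deque

def product_dts(delta1, q01, delta2, q02, alphabet):
    """Reachable part of the product of two deterministic transition systems."""
    q0 = (q01, q02)
    delta, states, todo = {}, {q0}, deque([q0])
    while todo:
        q = todo.popleft()
        q1, q2 = q
        for a in alphabet:
            # Both components read the same letter in lockstep.
            target = (delta1[(q1, a)], delta2[(q2, a)])
            delta[(q, a)] = target
            if target not in states:
                states.add(target)
                todo.append(target)
    return states, delta, q0
```

Restricting to reachable states does not change the runs of the product, since unreachable states never occur in a run.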


2.3.1 Acceptance conditions and ω-automata 


A state-based acceptance condition for 𝒯 is a positive boolean formula over the 
formal variables V_Q = {Inf(S), Fin(S) | S ⊆ Q}. Acceptance conditions are 
interpreted over runs. Given a run ρ of 𝒯 and an acceptance condition α, we 
consider the truth assignment that sets the variable Inf(S) to true iff ρ visits (some 
state of) S infinitely often, and sets Fin(S) to true iff ρ visits (all states of) S 
finitely often. The run ρ satisfies α if this truth assignment makes α true. The size 
of a condition α is its length as a boolean formula. 

A transition-based acceptance condition for 𝒯 is defined exactly as a state-based 
acceptance condition, but replacing the set V_Q by the set V_T = {Inf(U), Fin(U) | 
U ⊆ T}. In this paper we use state-based or transition-based acceptance conditions, 
depending on what is more convenient. It is well known that a state-based condition 
can be transformed into an equivalent transition-based one (i.e., a condition satisfied 
by the same runs). It suffices to replace each occurrence of Inf(S) by Inf(•S), 
where •S denotes the set of transitions with target in S, and similarly for Fin(S). 
Conversely, a transition-based condition can also be transformed into an equivalent 
state-based one by replicating the states. Given a DTS 𝒯 = (Q, Σ, δ, q_0) with a set 
T of transitions we construct the new DTS 𝒯′ with states {q_0} ∪ T, a transition 
(q_0, a, t) for every transition t = (q_0, a, q) of 𝒯, and a transition (t, b, t′) for every 
pair t = (q_1, a, q_2) and t′ = (q_2, b, q_3) of transitions of 𝒯. Then, the condition over 
the transitions of 𝒯 becomes an equivalent condition over the states of 𝒯′. 

A deterministic ω-automaton over Σ is a tuple 𝒜 = (Q, Σ, δ, q_0, α), where 
(Q, Σ, δ, q_0) is a deterministic transition system and α is an acceptance condition. 
𝒜 accepts a word w ∈ Σ^ω if the run ρ(w) satisfies α. The language of 𝒜, denoted 
by L(𝒜), is the set of words accepted by 𝒜. 

An acceptance condition α is a 


- Büchi condition if α = Inf(S) for some S ⊆ Q. 

- co-Büchi condition if α = Fin(S) for some S ⊆ Q. 

- Rabin condition if α = ⋁_{i=1}^{n} (Fin(F_i) ∧ Inf(I_i)) for sets F_1, I_1, …, F_n, I_n ⊆ Q. 
  The pair P_i = (F_i, I_i) is called a Rabin pair. 

- generalized Rabin condition if α = ⋁_{j=1}^{n} (Fin(F_j) ∧ ⋀_{k=1}^{k_j} Inf(I_{jk})) for sets 
  F_1, I_{11}, …, I_{1k_1}, …, F_n, I_{n1}, …, I_{nk_n} ⊆ Q. 


A deterministic Büchi, co-Büchi, Rabin or generalized Rabin automaton is a 
deterministic w-automaton with an acceptance condition of the corresponding kind. 
In the rest of the paper we shorten deterministic Rabin automaton to DRA, and 
the generalized version to DGRA. 

Observe that Büchi and co-Büchi conditions are special cases of Rabin conditions. 
Further, every generalized Rabin automaton can be degeneralized into an equivalent 
Rabin automaton, which however may incur an exponential blowup [KE12]. The 
generalized Rabin condition arises naturally when considering intersection of Rabin 
automata. Observe that we do not need to consider ⋀_{k=1}^{k_j} Fin(F_{jk}), but only 
Fin(F_j), because ⋀_{k=1}^{k_j} Fin(F_{jk}) is equivalent to Fin(⋃_{k=1}^{k_j} F_{jk}). 
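
For an ultimately periodic word u·v^ω the set of states visited infinitely often can be computed explicitly, which makes the conditions above easy to experiment with. The sketch below is an illustration of ours and not part of the construction: it assumes state-based acceptance, a transition function given as a dict on (state, letter) pairs, a non-empty cycle v, Rabin pairs given as (F, I) tuples of sets, and generalized Rabin pairs given as (F, [I_1, …, I_k]).

```python
def infinitely_visited(delta, q0, prefix, cycle):
    """States visited infinitely often by the run on prefix · cycle^ω."""
    q = q0
    for a in prefix:
        q = delta[(q, a)]
    starts = []                  # states seen at the start of each pass over `cycle`
    while q not in starts:
        starts.append(q)
        for a in cycle:
            q = delta[(q, a)]
    inf, p = set(), q            # from q on, the run repeats forever
    while True:
        for a in cycle:
            p = delta[(p, a)]
            inf.add(p)
        if p == q:
            return inf

def rabin_accepts(inf, pairs):
    """Some pair (F, I) with F visited finitely often and I infinitely often."""
    return any(not (F & inf) and bool(I & inf) for F, I in pairs)

def generalized_rabin_accepts(inf, pairs):
    """Some pair (F, [I_1, ..., I_k]) with F avoided and every I_j hit infinitely often."""
    return any(not (F & inf) and all(I & inf for I in Is) for F, Is in pairs)
```

As a usage example, a two-state automaton that remembers whether the last letter contained a, with the single Rabin pair ({q_n}, {q_a}), accepts exactly the words satisfying FGa.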

The following results are well known. 


Proposition 3 Given DRAs R_1 and R_2 recognizing languages L_1 and L_2, respectively, 
we can construct DRAs, denoted R_1 ∪ R_2 and R_1 ∩ R_2, recognizing L_1 ∪ L_2 
and L_1 ∩ L_2, respectively. Moreover, the transition system of both R_1 ∪ R_2 and 
R_1 ∩ R_2 is the product of the transition systems of R_1 and R_2. 


Proposition 4 Let I be a finite set of indices, and let R_i = (Q, Σ, δ, q_0, α_i) be a 
family of DRAs, one for every index i ∈ I, all of 
them with the same underlying transition system. Then R_∪ = (Q, Σ, δ, q_0, ⋁_{i∈I} α_i) 
is a DRA recognizing ⋃_{i∈I} L(R_i), and R_∩ = (Q, Σ, δ, q_0, ⋀_{i∈I} α_i) is a generalized 
DRA recognizing ⋂_{i∈I} L(R_i). 


3 Automata for G-free Formulae 


We present a translation of G-free formulae (i.e., formulae without any occurrence 
of the G-operator) into a deterministic ω-automaton with a very simple acceptance 
condition, which can be expressed both as a Büchi and a co-Büchi condition. The 
translation is by no means novel, but it serves as a warm-up for the next sections, 
which consider more general classes of formulae. Moreover, the section allows us to 
introduce the general scheme we use to design translations: first, we give a logical 
characterization theorem characterizing the words that satisfy a formula of the 
given class, and then we construct an automaton which accepts iff the condition of 
the characterization holds. 


Theorem 1 (Logical characterization theorem I) Let φ be a G-free formula 
and let w be a word. Then w ⊨ φ iff there exists i ≥ 0 such that af(φ, w_{0j}) ≡_P tt 
for every j ≥ i. 


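Proof By Lemma 1(2) it suffices to show that w ⊨ φ iff there exists i ≥ 0 such that 
af(φ, w_{0i}) ≡_P tt. (In the rest of this proof we use Lemma 1(2) without explicitly 
mentioning it.) 

(⇐): Assume there exists i ≥ 0 such that af(φ, w_{0i}) ≡_P tt. Then w_i ⊨ af(φ, w_{0i}). 
By Proposition 2, we get w = w_{0i} w_i ⊨ φ. 

(⇒): Assume w ⊨ φ. We proceed by structural induction on φ. We only consider 
two representative cases. 


- φ = a. Since w ⊨ φ we have w = νw′ for some word w′ and for some ν ∈ 2^Ap 
  such that a ∈ ν. By the definition of af we have af(a, ν) ≡_P tt, and, since 
  ν = w_{01}, we get af(φ, w_{01}) ≡_P tt. 

- φ = φ_1 U φ_2. By the semantics of LTL there is k ∈ N such that w_k ⊨ φ_2 and 
  w_ℓ ⊨ φ_1 for every 0 ≤ ℓ < k. By induction hypothesis there exists for every 
  0 ≤ ℓ < k an i ≥ ℓ such that af(φ_1, w_{ℓi}) ≡_P tt, and there exists an i ≥ k 
  such that af(φ_2, w_{ki}) ≡_P tt. Let j be the maximum of all those i's. We prove 
  af(φ_1 U φ_2, w_{0j}) ≡_P tt via induction on k. 


  - k = 0. 

\[
\begin{array}{lll}
 & af(\varphi_1\mathbf{U}\varphi_2, w_{0j}) & \\
= & af(\varphi_2, w_{0j}) \vee (af(\varphi_1, w_{0j}) \wedge af(\varphi_1\mathbf{U}\varphi_2, w_{1j})) & \text{(def. of } af\text{)} \\
\equiv_P & \mathbf{tt} \vee (af(\varphi_1, w_{0j}) \wedge af(\varphi_1\mathbf{U}\varphi_2, w_{1j})) & (af(\varphi_2, w_{kj}) \equiv_P \mathbf{tt}) \\
\equiv_P & \mathbf{tt} &
\end{array}
\]

  - k > 0. 

\[
\begin{array}{lll}
 & af(\varphi_1\mathbf{U}\varphi_2, w_{0j}) & \\
= & af(\varphi_2, w_{0j}) \vee (af(\varphi_1, w_{0j}) \wedge af(\varphi_1\mathbf{U}\varphi_2, w_{1j})) & \text{(def. of } af\text{)} \\
\equiv_P & af(\varphi_2, w_{0j}) \vee (\mathbf{tt} \wedge af(\varphi_1\mathbf{U}\varphi_2, w_{1j})) & (af(\varphi_1, w_{0j}) \equiv_P \mathbf{tt}) \\
\equiv_P & af(\varphi_2, w_{0j}) \vee (\mathbf{tt} \wedge \mathbf{tt}) & \text{(ind. hyp.)} \\
\equiv_P & \mathbf{tt} &
\end{array}
\]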


We derive from Theorem 1 a deterministic ω-automaton for a given G-free 
formula φ. The states of the automaton are equivalence classes of formulae under 
propositional equivalence. The fundamental design idea is: after reading a finite 
word w_{0j}, the current state of the automaton must be af(φ, w_{0j}). So we take the 
equivalence class of af(φ, ε) = φ as initial state, and the function af itself as 
transition function. By Theorem 1, a word satisfies φ iff its run in this automaton 
visits the state [tt]_P. Since we have af(tt, ν) = tt for every ν ∈ 2^Ap, the run visits 
[tt]_P iff it visits [tt]_P infinitely often or, equivalently, iff it visits all other states 
only finitely often. So we can take F = {[tt]_P} as Büchi condition. 


Definition 7 Let φ be a G-free formula. Let Reach(φ) denote the set of equivalence 
classes of the formulae reachable from φ w.r.t. propositional equivalence. The 
transition system of φ is the deterministic transition system 𝒯(φ) = (Q, 2^Ap, q_0, δ) 
where 


- Q is the quotient of Reach(φ) under propositional equivalence. 
  (In other words, [ψ]_P is a state of 𝒯(φ) iff af(φ, w) = ψ for some finite word 
  w.) 

- q_0 = [φ]_P, the equivalence class of φ. 

- δ([ψ]_P, ν) = [af(ψ, ν)]_P for every [ψ]_P ∈ Q and every ν ∈ 2^Ap. 
  (I.e., there is a transition from [ψ]_P to [ψ′]_P labelled by ν iff af(ψ, ν) = ψ′.) 


The Büchi automaton for φ is the tuple B(φ) = (Q, 2^Ap, q_0, δ, F), where F = 
{[tt]_P}. Observe that it can be also seen as a co-Büchi automaton with F = 
Q \ {[tt]_P}. 
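
Definition 7 can be turned into a small exploration procedure. The sketch below is a naive illustration of ours, not the implementation of Section 9. It assumes formulae encoded as nested tuples in negation normal form, and it represents the class [ψ]_P by the truth table of ψ over the proper subformulae of φ, which is justified by Lemma 1(1). The constants tt and ff are evaluated as true and false. The enumeration is exponential in the number of proper subformulae and is only meant for small examples.

```python
from itertools import combinations, product

TT, FF = ("tt",), ("ff",)

def af(f, letter):
    """One-step 'after' function for G-free formulae in negation normal form."""
    op = f[0]
    if op in ("tt", "ff"):
        return f
    if op == "ap":
        return TT if f[1] in letter else FF
    if op == "not":
        return FF if f[1][1] in letter else TT
    if op in ("and", "or"):
        return (op, af(f[1], letter), af(f[2], letter))
    if op == "X":
        return f[1]
    if op == "F":
        return ("or", af(f[1], letter), f)
    if op == "U":
        return ("or", af(f[2], letter), ("and", af(f[1], letter), f))
    raise ValueError("only G-free formulae are supported")

def proper_subformulae(f):
    """All subformulae whose root is not and/or (constants excluded)."""
    if f[0] in ("tt", "ff"):
        return set()
    if f[0] in ("and", "or"):
        return proper_subformulae(f[1]) | proper_subformulae(f[2])
    children = [g for g in f[1:] if isinstance(g, tuple)]
    return {f}.union(*(proper_subformulae(g) for g in children))

def holds(assignment, f):
    if f[0] == "tt":
        return True
    if f[0] == "ff":
        return False
    if f[0] == "and":
        return holds(assignment, f[1]) and holds(assignment, f[2])
    if f[0] == "or":
        return holds(assignment, f[1]) or holds(assignment, f[2])
    return assignment[f]

def build(phi, ap):
    """Return (representatives, delta, q0, accepting state) of B(phi)."""
    atoms = sorted(proper_subformulae(phi), key=repr)
    rows = [dict(zip(atoms, v)) for v in product((False, True), repeat=len(atoms))]
    cls = lambda f: tuple(holds(r, f) for r in rows)   # truth table stands for [f]_P
    letters = [frozenset(c) for n in range(len(ap) + 1)
               for c in combinations(sorted(ap), n)]
    q0 = cls(phi)
    rep, delta, todo = {q0: phi}, {}, [q0]
    while todo:
        q = todo.pop()
        for v in letters:
            succ = af(rep[q], v)       # any representative works, by Lemma 1(3)
            t = cls(succ)
            delta[(q, v)] = t
            if t not in rep:
                rep[t] = succ
                todo.append(t)
    return rep, delta, q0, cls(TT)

def run(delta, q0, word):
    """State reached after reading a finite word (a list of sets of propositions)."""
    q = q0
    for v in word:
        q = delta[(q, frozenset(v))]
    return q
```

For φ = a ∨ (b U c) over Ap = {a, b, c} the exploration finds four states, namely the classes of φ, b U c, tt, and ff, each with 8 outgoing transitions.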


Example 2 Figure 1 shows the automaton for the formula φ = a ∨ (b U c). We 
assume Ap = {a, b, c}. The alphabet 2^Ap contains 8 elements, and so every state has 
8 outgoing transitions. To avoid cluttering the figure, we use a boolean-function-like 
notation for transitions. For example, a transition from q_2 to q_3 labelled c denotes 
that there is a transition from q_2 to q_3 for every subset of Ap containing c. So, 
actually, this single arrow stands for four different transitions. Similarly, a 
transition from q_1 to q_3 labelled a + āc means that there is a transition 
from q_1 to q_3 for each subset of Ap that either contains a, or does not contain a 
and contains c. 

Fig. 1: Büchi (or co-Büchi) automaton for a ∨ (b U c). 


Theorem 2 Let φ be a G-free formula. Then L(B(φ)) = L(φ). 

Proof Immediate consequence of Theorem 1 and the definition of B(φ). 


Remark: Computing B(φ) requires a data structure to represent the equivalence 
classes of the formulae of Reach(φ) with respect to propositional equivalence. Let 
PF(φ) denote the set of proper subformulae of φ. By Lemma 1(1), a formula of 
Reach(φ) is a boolean combination of formulae of PF(φ). Hence, every formula of 
Reach(φ) induces a boolean function over PF(φ), and two formulae of Reach(φ) 
are propositionally equivalent iff they induce the same function. In other words, 
the equivalence class of a formula can be identified with its boolean function. In 
our implementation, described in Section 9, we use Binary Decision Diagrams as 
data structure for boolean functions. It is well known that with this data structure 
propositional equivalence can be checked in constant time. Other operations have 
exponential worst-case complexity, but in all our experiments the time needed to 
perform them is negligible. 


4 DRAs for Simple FG-Formulae 


We introduce the main building block of our paper: a procedure to construct a DRA 
for formulae FGφ where φ is G-free, i.e., contains no occurrence of G. (Notice 
that even the formula FGa has no equivalent deterministic Büchi automaton.) 
As in the previous section, we first characterize the words w satisfying a formula 
FGφ where φ is G-free, and then show how to construct a DRA that accepts iff 
the condition of the characterization holds. However, in this section we divide this 
step into two parts. We first introduce an auxiliary automata model, called Mojmir 
automata², and show how to construct a Mojmir automaton recognizing L(FGφ). 
(Mojmir automata are designed to make this construction intuitive and easy to 
grasp.) Then we show how to transform Mojmir automata into equivalent DRAs. 


² Named in honour of Mojmír Křetínský, father of one of the authors. 


4.1 Logical characterization 


The logical characterization of the words satisfying FGφ is an easy consequence of 
Theorem 1. 


Theorem 3 (Logical characterization theorem II) Let FGφ be a formula 
such that φ is G-free. Then w ⊨ FGφ iff for almost every i ∈ N there exists j ≥ i 
such that af(φ, w_{ij}) ≡_P tt. 


Proof By the semantics of LTL, w ⊨ FGφ iff w_i ⊨ φ for almost every i ∈ N. 
By Theorem 1 applied to each suffix w_i, w ⊨ FGφ iff for almost every i ∈ N there 
exists j ≥ i such that af(φ, w_{ij}) ≡_P tt. 


4.2 Mojmir automata 


By the definition of LTL, we have w ⊨ FGφ iff w_i ⊨ φ for all but finitely many 
i ≥ 0. Let A_φ be the deterministic co-Büchi automaton recognizing L(φ). From a 
mathematical point of view, we can recognize L(FGφ) with the help of an infinite 
array of copies of A_φ. The ith automaton reads w_i, i.e., it skips the first i 
letters of the input word, and then starts reading. Therefore, the ith automaton 
accepts iff w_i ⊨ φ. The array accepts iff almost every array element accepts. Figure 
2 shows the first four elements of the array for the formula FG(a ∨ (b U c)). The 
figure shows the state of the elements after reading (abc) (abc) (abc). For example, 
the automaton on the left has read all three letters, and reached state q_3, graphically 
displayed by putting a token on the state, while the next one has only read the 
last two letters, and reached state qo. The last automaton has not yet read any 
letter, and so it is currently in state q1. 


We now observe that the complete array can be replaced by one single automaton 
that handles all the tokens simultaneously. We call such an automaton a Mojmir 
automaton. The bottom part of Figure 2 shows the configuration of the Mojmir 
automaton corresponding to the array at the top. After reading (abc) (abc) (abc), 
the automaton has created four tokens, labelled with their birthdates. Intuitively, 
when the automaton reads a letter it moves all tokens according to the transition 
function, and then puts a fresh token in the initial state, labelled with the position 
of the letter. Initially there is a unique token at the initial state, labelled by 0. The 
automaton accepts if almost every token eventually reaches an accepting state. 


Definition 8 A Mojmir automaton is a tuple $M = (Q, \Sigma, q_0, \delta, F)$, where $(Q, \Sigma, q_0, \delta)$
is a DTS and $F \subseteq Q$ is a set of accepting states satisfying $\delta(F, \nu) \subseteq F$ for every
$\nu \in \Sigma$, i.e., states reachable from accepting states are also accepting.

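The run of $M$ over a word $w = w[0]w[1]\cdots \in \Sigma^\omega$ is the infinite sequence

\[ (q_0^0)\,(q_0^1, q_1^1)\,(q_0^2, q_1^2, q_2^2)\,(q_0^3, q_1^3, q_2^3, q_3^3)\cdots \]

where

\[ q_{\mathit{token}}^{\mathit{time}} = \begin{cases} q_0 & \text{if } \mathit{token} = \mathit{time}, \\ \delta(q_{\mathit{token}}^{\mathit{time}-1},\, w[\mathit{time}-1]) & \text{if } \mathit{token} < \mathit{time}. \end{cases} \]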


Fig. 2: The top row shows the first four elements of the array of co-Büchi automata for
$\mathbf{FG}(a \vee (b\,\mathbf{U}\,c))$ after reading abc abc abc. At the bottom, the corresponding configuration of
the Mojmir automaton.


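The position of a token at a time in the run is given by the function $\mathit{run}_w : \mathbb{N} \times
\mathbb{N} \to Q \cup \{\bot\}$, defined as follows:

\[ \mathit{run}_w(\mathit{token}, \mathit{time}) = \begin{cases} \bot & \text{if } \mathit{token} > \mathit{time}, \\ q_{\mathit{token}}^{\mathit{time}} & \text{if } \mathit{token} \leq \mathit{time}. \end{cases} \]

For every time $t \in \mathbb{N}$, we denote by $\mathit{conf}_w(t)$ the function defined by
$\mathit{token} \mapsto \mathit{run}_w(\mathit{token}, t)$.

We call $\mathit{conf}_w(t)$ the configuration of the run of $M$ on $w$ at time $t$. The run of $M$
on $w$ is accepting if for almost every $\mathit{token} \in \mathbb{N}$ there exists $\mathit{time} \in \mathbb{N}$ such that
$\mathit{run}_w(\mathit{token}, \mathit{time}) \in F$.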
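The definitions of the run and of its configurations can be illustrated with a short Python sketch that simulates the token positions of the running example. It is only an illustration under assumptions: the state names q1 to q4 (standing for a ∨ (b U c), b U c, tt and ff), the encoding of letters as sets of atomic propositions, and the sample word are chosen here and are not taken from the figures.

```python
from typing import FrozenSet, List, Optional, Sequence

Letter = FrozenSet[str]

Q0 = "q1"  # assumed initial state, standing for a | (b U c)


def delta(q: str, v: Letter) -> str:
    """Assumed transition function of the running example."""
    if q == "q1":                        # a | (b U c)
        if "a" in v or "c" in v:
            return "q3"
        return "q2" if "b" in v else "q4"
    if q == "q2":                        # b U c
        if "c" in v:
            return "q3"
        return "q2" if "b" in v else "q4"
    return q                             # q3 (tt) and q4 (ff) are sinks


def run(word: Sequence[Letter], token: int, time: int) -> Optional[str]:
    """run_w(token, time); None plays the role of the undefined value."""
    if token > time:
        return None                      # the token has not been created yet
    q = Q0                               # created in the initial state at time == token
    for t in range(token, time):         # then reads w[token], ..., w[time - 1]
        q = delta(q, word[t])
    return q


def conf(word: Sequence[Letter], time: int) -> List[Optional[str]]:
    """conf_w(time), restricted to the tokens 0..time that already exist."""
    return [run(word, token, time) for token in range(time + 1)]


# Hypothetical word prefix: {a}, {b}, {b}
word = [frozenset("a"), frozenset("b"), frozenset("b")]
print(conf(word, 2))  # positions of tokens 0, 1, 2 after two letters
```

With this assumed word, the configuration at time 2 places tokens 0, 1 and 2 in q3, q2 and q1, while every younger token is still undefined.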


Given a $\mathbf{G}$-free formula $\varphi$, the Mojmir automaton equivalent to $\mathbf{FG}\varphi$ has exactly
the same syntactic structure as the Büchi automaton for $\varphi$: only the notions of
run and acceptance are different.


Definition 9 Let $\varphi$ be a $\mathbf{G}$-free formula. The Mojmir automaton for $\mathbf{FG}\varphi$ is
$M(\varphi) = (\mathit{Reach}(\varphi), 2^{Ap}, [\varphi]_P, \mathit{af}, \{[\mathbf{tt}]_P\})$.

Fig. 3: Mojmir automaton for $\mathbf{FG}(a \vee (b\,\mathbf{U}\,c))$, and matrix representation of $\mathit{run}_w(\mathit{token}, \mathit{time})$
for w = abc abc abc···.


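Token $i$ of $M(\varphi)$ reaches the accepting state $[\mathbf{tt}]_P$ at time $j$ exactly when
$\mathit{af}(\varphi, w_{ij}) \equiv_P \mathbf{tt}$. Since $M(\varphi)$ accepts iff almost every token eventually reaches an accepting
state, Theorem 3 shows that $M(\varphi)$ accepts a word $w$ iff $w \models \mathbf{FG}\varphi$, and so we have:


Theorem 4 Let $\varphi$ be a $\mathbf{G}$-free formula. Then $L(M(\varphi)) = L(\mathbf{FG}\varphi)$.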


Example 3 Figure 3 shows the Mojmir automaton for $\mathbf{FG}(a \vee (b\,\mathbf{U}\,c))$ and the
matrix representation of $\mathit{run}_w(\mathit{token}, \mathit{time})$ for w = abc abc abc···. The configurations
of the run are given by the columns of the matrix. For instance, $\mathit{conf}_w(2)$
is the mapping $0 \mapsto q_3$, $1 \mapsto q_2$, $2 \mapsto q_1$, $\forall i \geq 3 : i \mapsto \bot$ given by the third column,
indicating that after two steps the tokens 0, 1, 2 are in states $q_3$, $q_2$, $q_1$, respectively,
and other tokens do not exist yet.


In the rest of the section we show how to construct a deterministic Rabin
automaton equivalent to a given Mojmir automaton. In Section 4.3 we define
an abstraction that assigns to each configuration $\mathit{conf}_w(t)$ of a run an abstract
object $\mathit{sr}_w(t)$, called a state-ranking. Since the run of $M$ on a word $w$ is completely
characterized by the sequence of configurations $\mathit{conf}_w(0)\,\mathit{conf}_w(1)\,\mathit{conf}_w(2)\cdots$,
the abstraction also abstracts a run into the infinite sequence of state-rankings
$\mathit{sr}_w(0)\,\mathit{sr}_w(1)\,\mathit{sr}_w(2)\cdots$. Sections 4.4 and 4.5 show that the abstraction has the
following properties:


1. There is an easily computable function that given $\mathit{sr}_w(t)$ and $w[t+1]$ returns
$\mathit{sr}_w(t+1)$. (Lemma 3)

2. A run is accepting iff its corresponding abstract run satisfies a certain Rabin
condition. (Definition 16)


Finally, Section 4.6 derives the deterministic Rabin automaton. As the reader can 
expect, the automaton will have the state-rankings as states, the function of (1) as 
transition function, and the condition of (2) as acceptance condition. 


4.3 State-rankings 
Intuitively, a state-ranking of a Mojmir automaton $M$ is a ranking of the states
of $M$. Our state-rankings are allowed to be partial, that is, to leave some states
unranked.

Definition 10 Let $M$ be a Mojmir automaton with $n$ states. A state-ranking of
$M$ is a partial injective function $\mathit{sr}: Q \to \{1,\ldots,n\}$, such that if the image of $\mathit{sr}$
contains $i$, then it also contains $j$ for every $j < i$. When $\mathit{sr}(q)$ is undefined, we write
$\mathit{sr}(q) = \bot$. The set of state-rankings of $M$ is denoted by $SR$.


The state-ranking $\mathit{sr}_w(t)$ associated to $\mathit{conf}_w(t)$ is the result of performing
a sequence of abstraction steps, which we illustrate on an example. Consider a
Mojmir automaton $M$ with states $\{q_0, q_1, q_2, q_3, q_4, q_5, q_6\}$. Assume that, after the
first 8 steps of its run on some word, $M$ has reached the following configuration,
where for each state we give the set of tokens currently at that state:

\[ \begin{array}{ccccccc} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ (\,\{3,8\} & \{1,2\} & \emptyset & \{5,7\} & \{4\} & \{6\} & \{0\}\,) \end{array} \qquad (1) \]

Assume further that states $q_5, q_6$ are sinks, meaning that $\delta(q_5, \nu) = q_5$ and
$\delta(q_6, \nu) = q_6$ for every alphabet letter $\nu$.³ We start the abstraction process by
discarding the information about tokens in sinks. We use the symbol $\bot$ to denote
this, and obtain:

\[ \begin{array}{ccccccc} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ (\,\{3,8\} & \{1,2\} & \emptyset & \{5,7\} & \{4\} & \bot & \bot\,) \end{array} \]

We continue by keeping only the oldest token of each state (that is, the one
with the smallest number). If the state is not populated by any token, again we
just write $\bot$. We obtain:

\[ \begin{array}{ccccccc} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ (\,3 & 1 & \bot & 5 & 4 & \bot & \bot\,) \end{array} \]
We call tokens 3,1,5 and 4 the senior tokens of the configuration, or just the 
seniors. 

Since a run has infinitely many tokens, the number of possible abstract config- 
urations of the automaton is still infinite. So we discard even more information. 
We throw away the identities of the senior tokens, and keep only their relative 
seniority rank: the oldest senior token has rank 1, the second oldest rank 2, etc. 
We obtain the state-ranking 


\[ \begin{array}{ccccccc} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ (\,2 & 1 & \bot & 4 & 3 & \bot & \bot\,) \end{array} \]

It is useful to think of the set of tokens at a state as the partners of a partnership
firm. The senior partner is the oldest token. The name of the firm is the rank of the
senior partner. For instance, the firm 2 at state $q_0$ has tokens 3 and 8 as partners.

Let us formally define the rank $\mathit{rk}_w(\tau, t)$ of token $\tau$ at time $t$, and the
state-ranking $\mathit{sr}_w(t)$ at time $t$.


3 For technical reasons, we also decree that the initial state cannot be a sink.

Definition 11 Let $M = (Q, \Sigma, q_0, \delta, F)$ be a Mojmir automaton with $n$ states. A
state $q \in Q$ is a sink if $q \neq q_0$ and $\delta(q, \nu) = q$ for every $\nu \in \Sigma$.

Let $w \in \Sigma^\omega$ be a word, and consider the run of $M$ on $w$. Given two tokens
$\tau, \tau' \in \mathbb{N}$, we say that $\tau$ is older than $\tau'$ if $\tau < \tau'$. The senior of token $\tau$ at time
$t \geq \tau$ is the oldest token $\tau'$ such that $\mathit{run}_w(\tau, t) = \mathit{run}_w(\tau', t)$. If a token $\tau$ is its own
senior, then we call $\tau$ a senior (at time $t$).

The rank of token $\tau$ at time $t \geq \tau$, denoted by $\mathit{rk}_w(\tau, t)$, is defined as follows:

- If $\mathit{run}_w(\tau, t)$ is a sink, then $\mathit{rk}_w(\tau, t) = \bot$ (we say that $\tau$ is unranked at time $t$).

- If $\mathit{run}_w(\tau, t)$ is not a sink, then let $s$ be the senior of token $\tau$ at time $t$. The
rank $\mathit{rk}_w(\tau, t)$ is the number of senior tokens $\tau'$ such that $\mathit{run}_w(\tau', t)$ is not a
sink and $\tau' \leq s$.


(Observe that $\mathit{run}_w(\tau, t) = \mathit{run}_w(\tau', t)$ implies that $\tau$ and $\tau'$ have the same
senior, and so that $\mathit{rk}_w(\tau, t) = \mathit{rk}_w(\tau', t)$; so all tokens at the same state get the
same rank.)

Finally, the state-ranking at time $t$, denoted by $\mathit{sr}_w(t)$, is the mapping
that assigns to each state $q \in Q$ its rank $\mathit{sr}_w(t, q) \in \{1,\ldots,n\}$, or $\bot$ if the state is unranked, defined as
follows:


- If $q$ is a sink, then $\mathit{sr}_w(t, q) = \bot$.

- If $q$ is not a sink and no token $\tau$ satisfies $\mathit{run}_w(\tau, t) = q$, then $\mathit{sr}_w(t, q) = \bot$.

- If $q$ is not a sink and some token $\tau$ satisfies $\mathit{run}_w(\tau, t) = q$, then $\mathit{sr}_w(t, q) =
\mathit{rk}_w(\tau, t)$.


Example 4 Consider token 7 in the configuration (1). The senior of 7
is 5. The seniors outside the sinks are 3, 1, 5, 4. Since all four of them are at least as old as 5, the rank of
token 7 is 4. Since the configuration is the result of reading the first 8 letters of a
word $w$, we have $\mathit{rk}_w(7, 8) = 4$.
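The three abstraction steps (drop sinks, keep the oldest token per state, replace seniors by their seniority rank) can be sketched in a few lines of Python. The dictionary encoding of a configuration and the function names are assumptions made for this illustration; `None` stands for $\bot$.

```python
from typing import Dict, Optional, Set


def state_ranking(conf: Dict[str, Set[int]], sinks: Set[str]) -> Dict[str, Optional[int]]:
    """Abstract a configuration (state -> tokens at that state) into a state-ranking."""
    # Steps 1 and 2: forget tokens in sinks, keep only the oldest token of each state.
    seniors = {q: min(tokens) for q, tokens in conf.items() if tokens and q not in sinks}
    # Step 3: replace each senior by its seniority rank (1 = oldest senior).
    by_age = sorted(seniors.values())
    return {q: (by_age.index(seniors[q]) + 1 if q in seniors else None) for q in conf}


def rank_of(token: int, conf: Dict[str, Set[int]], sinks: Set[str]) -> Optional[int]:
    """Rank of a token: the rank of the state it currently occupies."""
    ranking = state_ranking(conf, sinks)
    for q, tokens in conf.items():
        if token in tokens:
            return ranking[q]
    raise ValueError("token does not exist in this configuration")


# Configuration (1) from the text, with sinks q5 and q6.
conf1 = {"q0": {3, 8}, "q1": {1, 2}, "q2": set(), "q3": {5, 7},
         "q4": {4}, "q5": {6}, "q6": {0}}
sinks = {"q5", "q6"}
print(state_ranking(conf1, sinks))
print(rank_of(7, conf1, sinks))
```

On configuration (1) this yields the ranks 2, 1, $\bot$, 4, 3, $\bot$, $\bot$ for $q_0,\ldots,q_6$ and rank 4 for token 7, in line with Example 4.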


While the birthdate of a token does not change along a run, its rank can change,
and for two different reasons. Assume the current rank of a token $\tau$ is 4. If the
firm of rank, say, 3, moves to a sink, then it “disappears”, and the rank of $\tau$ is
upgraded to 3. If the token's firm merges with the firm of rank, say, 2, the rank of
$\tau$ is upgraded to 2. In both cases, we observe that, as long as the token does not
reach a sink, its rank can only improve (get older) along a run.


Lemma 2 Let $M = (Q, \Sigma, q_0, \delta, F)$ be a Mojmir automaton and let $w \in \Sigma^\omega$ be a
word. For every token $\tau \in \mathbb{N}$:

- if $\mathit{rk}_w(\tau, t) = \bot$ for some $t \in \mathbb{N}$, then $\mathit{rk}_w(\tau, t') = \bot$ for every $t' \geq t$.

- if $t \leq t'$ and $\mathit{rk}_w(\tau, t), \mathit{rk}_w(\tau, t') \in \mathbb{N}$, then $\mathit{rk}_w(\tau, t) \geq \mathit{rk}_w(\tau, t')$.


Proof Follows easily from the definitions. 


4.4 Computing the successor of a state-ranking 


Recall that the run of a Mojmir automaton on a word $w$ is completely determined
by the sequence of configurations $\mathit{conf}_w(0)\,\mathit{conf}_w(1)\,\mathit{conf}_w(2)\cdots$. To this sequence
corresponds a sequence $\mathit{sr}_w(0)\,\mathit{sr}_w(1)\,\mathit{sr}_w(2)\cdots$ of state-rankings. We show that
$\mathit{sr}_w(t+1)$ can be directly computed from $\mathit{sr}_w(t)$ and the letter $w[t+1]$. More
precisely, we define a function $\mathit{nxt}: SR \times \Sigma \to SR$ and show that it satisfies
$\mathit{nxt}(\mathit{sr}_w(t), w[t+1]) = \mathit{sr}_w(t+1)$ for every time $t$.

Let $\mathit{sr}_w(t)$ be the state-ranking

\[ \begin{array}{ccccccc} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ (\,2 & 1 & \bot & 4 & 3 & \bot & \bot\,) \end{array} \]

Assume $w[t+1] = \nu$ for some $\nu \in \Sigma$, and assume further that

\[ \delta(q_0, \nu) = q_5, \qquad \delta(q_1, \nu) = q_2 = \delta(q_3, \nu), \qquad \delta(q_4, \nu) = q_3. \]

We obtain $\mathit{sr}_w(t+1)$ in four steps:
(i) Move all senior tokens according to $\delta$.
The token of rank 2 at $q_0$ moves to the sink $q_5$ (recall that $q_5$ and $q_6$ are sinks)
and “disappears”. The tokens of ranks 1 and 4 move to state $q_2$. The token of
rank 3 at $q_4$ moves to $q_3$. We obtain:

\[ \begin{array}{ccccccc} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ (\,\bot & \bot & \{1,4\} & 3 & \bot & \bot & \bot\,) \end{array} \]

(ii) If a state holds more than one token, keep only the most senior token.
Only the token of rank 1 survives in $q_2$. Intuitively, the firms with rank 1 and
4 merge, and 1 becomes the senior partner.

\[ \begin{array}{ccccccc} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ (\,\bot & \bot & 1 & 3 & \bot & \bot & \bot\,) \end{array} \]

(iii) Recompute the seniority ranks of the remaining tokens.
The token of rank 3 is upgraded to rank 2.

\[ \begin{array}{ccccccc} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ (\,\bot & \bot & 1 & 2 & \bot & \bot & \bot\,) \end{array} \]

(iv) If there is no token on the initial state, add one with the next lowest seniority
rank.
We add a token to $q_0$ of rank 3, so that $q_0$, $q_2$ and $q_3$ end up with ranks 3, 1 and 2, and all other states are unranked.


The corresponding formal definition is: 


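Definition 12 Let $M = (Q, \Sigma, q_0, \delta, F)$ be a Mojmir automaton with $n$ states
and a set $S$ of sinks. Let $\mathit{sr}$ be a state-ranking of $M$, and let $\nu \in \Sigma$. For every
$q \in Q$, the set of ranks of $\mathit{sr}$ that move to $q$ under $\nu$, denoted by $\mathit{mvto}(q)$, is given
by:

\[ \mathit{mvto}(q) = \begin{cases} \{\mathit{sr}(q') \mid \mathit{sr}(q') \neq \bot \wedge \delta(q', \nu) = q\} \cup \{n\} & \text{if } q = q_0, \\ \{\mathit{sr}(q') \mid \mathit{sr}(q') \neq \bot \wedge \delta(q', \nu) = q\} & \text{if } q \neq q_0. \end{cases} \]

The state-ranking $\mathit{nxt}(\mathit{sr}, \nu)$ is defined with $\min(\emptyset) = \infty$ by:

\[ \mathit{nxt}(\mathit{sr}, \nu)(q) = \begin{cases} |\{q' \in Q \setminus S \mid \min(\mathit{mvto}(q')) \leq \min(\mathit{mvto}(q))\}| & \text{if } q \notin S \text{ and } \mathit{mvto}(q) \neq \emptyset, \\ \bot & \text{otherwise.} \end{cases} \]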


We get the following lemma.


Lemma 3 Let $M$ be a Mojmir automaton and let $w$ be a word. Then $\mathit{sr}_w(t+1) =
\mathit{nxt}(\mathit{sr}_w(t), w[t+1])$ for every $t \geq 0$.


Proof (Sketch.) The key observation for the proof is that $\mathit{nxt}(\mathit{sr}_w(t), w[t+1])$
computes for a state $q$ the set of non-sink states $q'$ that are at least as senior as $q$ at time $t+1$, and then takes the
cardinality of this set as a value. This coincides with the definition of $\mathit{sr}_w(t+1)$.
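The successor function of Definition 12 can be sketched directly in Python and checked against the four-step example above. This is a minimal sketch, not the authors' implementation: the dictionary encoding, the use of `None` for $\bot$, and the arbitrary choice of a self-loop on the unranked state q2 are assumptions of this illustration.

```python
from typing import Callable, Dict, List, Optional, Set

Ranking = Dict[str, Optional[int]]


def nxt(sr: Ranking, v: str, states: List[str], sinks: Set[str], q0: str,
        delta: Callable[[str, str], str]) -> Ranking:
    """Successor state-ranking following Definition 12 (None stands for 'unranked')."""
    n = len(states)
    # mvto(q): ranks that move to q under v; the fresh token at q0 is represented by n.
    mvto: Dict[str, Set[int]] = {q: set() for q in states}
    for q in states:
        if sr[q] is not None:
            mvto[delta(q, v)].add(sr[q])
    mvto[q0].add(n)
    # min(mvto(q)), with the minimum of the empty set taken to be infinity.
    low = {q: min(mvto[q], default=float("inf")) for q in states}
    result: Ranking = {}
    for q in states:
        if q not in sinks and mvto[q]:
            # Count the non-sink states whose oldest incoming rank is at least as old.
            result[q] = sum(1 for p in states if p not in sinks and low[p] <= low[q])
        else:
            result[q] = None
    return result


# Worked example from the text; the move of the unranked state q2 is irrelevant
# and chosen arbitrarily here.
states = ["q0", "q1", "q2", "q3", "q4", "q5", "q6"]
moves = {"q0": "q5", "q1": "q2", "q2": "q2", "q3": "q2", "q4": "q3", "q5": "q5", "q6": "q6"}
sr_t = {"q0": 2, "q1": 1, "q2": None, "q3": 4, "q4": 3, "q5": None, "q6": None}
sr_next = nxt(sr_t, "v", states, {"q5", "q6"}, "q0", lambda q, v: moves[q])
print(sr_next)
```

The result ranks $q_2$, $q_3$ and $q_0$ with 1, 2 and 3 and leaves every other state unranked, which is what steps (i) to (iv) produce.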
Fig. 4: A Mojmir automaton for $a \vee (b\,\mathbf{U}\,c)$ and its corresponding DRA.


We already have all we need to define the states and transition function of
the DRA equivalent to a given Mojmir automaton (although not the acceptance
condition). The states of the Rabin automaton are the state-rankings, and the
transition function is given by $\mathit{nxt}$.


Example 5 Figure 4 shows our running example on the left, and the states and
transitions of its corresponding Rabin automaton on the right. Since states $q_3$ and
$q_4$ are sinks, state-rankings only rank states $q_1$ and $q_2$. The initial state-ranking is
$(1, \bot)$. The only other state-ranking reachable from it turns out to be $(2, 1)$.


4.5 Deciding acceptance of an abstract run 


We define a Rabin acceptance condition that turns the transition system above into 
a DRA equivalent to the Mojmir automaton. We start by classifying the tokens of 
a run of the Mojmir automaton. 


Definition 13 Let $M = (Q, \Sigma, \delta, q_0, F)$ be a Mojmir automaton and let $w$ be a
word. A token $\tau \in \mathbb{N}$ of the run of $M$ on $w$


- squats if it never reaches a sink
(that is, if $\mathit{run}_w(\tau, t) \in Q \setminus S$ for every $t \geq \tau$);

- fails if it eventually reaches a non-accepting sink
(that is, if there exists $t \in \mathbb{N}$ such that $\mathit{run}_w(\tau, t) \in S \setminus F$);

- succeeds if it eventually reaches an accepting state, sink or non-sink
(that is, if there exists $t \in \mathbb{N}$ such that $\mathit{run}_w(\tau, t) \in F$).


Further, we say that a token succeeds at rank $i$ if it has rank $i$ immediately before
entering the set of accepting states, i.e., if there is $t \in \mathbb{N}$ such that $\mathit{run}_w(\tau, t) \notin
F \setminus \{q_0\}$, $\mathit{run}_w(\tau, t+1) \in F$, and $\mathit{rk}_w(\tau, t) = i$.⁴


4 Observe that in the special case $q_0 \in F$ (all states are accepting), the first move of each
token is considered succeeding.

Observe that the three classes are not disjoint. More precisely, a token either fails,
succeeds, or squats in non-accepting states. By definition, a Mojmir automaton
accepts a word $w$ if all but finitely many of the tokens generated during the run
on $w$ succeed (recall that tokens that reach an accepting state stay within the set
of accepting states). So, given the abstract run of $M$ on $w$, our task is to find a
Rabin condition equivalent to “only finitely many tokens fail and only finitely many
tokens squat in non-accepting states”. The condition equivalent to “only finitely
many tokens fail” is simple: since a token fails when it moves into a non-accepting
sink, we stipulate that transitions moving tokens into non-accepting sinks can only
occur finitely often.

Finding a condition equivalent to “only finitely many tokens squat in non-accepting
states” is a bit more involved. Observe that, since a squatter $\tau$ never
reaches a sink, it has a rank at every moment in time. So, if infinitely many tokens
squat in non-accepting states, then, since they are all confined within $Q \setminus (S \cup F)$,
infinitely many firm merges must take place in this set of states. This suggests the
following definition:


Definition 14 Let $M = (Q, \Sigma, \delta, q_0, F)$ be a Mojmir automaton and let $w$ be
a word. Let $\tau, \tau' \in \mathbb{N}$ be two tokens such that $\tau < \tau'$. We say that $\tau$ and $\tau'$
merge during the run of $M$ on $w$ if there is $t \in \mathbb{N}$ and a state $q \notin F$ such that
$\mathit{run}_w(\tau, t) = q = \mathit{run}_w(\tau', t)$, and one of the two following conditions holds:


- $\tau' < t$ and $\mathit{run}_w(\tau, t-1) \neq \mathit{run}_w(\tau', t-1)$.
(Both tokens already existed at time $t-1$, and were at different states.)

- $\tau' = t$.
(Token $\tau'$ is created at time $t$.)


Further, we say that the tokens merge at rank $i$ if $\mathit{rk}_w(\tau, t) = i$.


Notice the condition $q \notin F$ in the definition: we reserve the term “merge” for the
merges occurring in non-accepting states.


If two tokens merge at some time t, then from that moment on they follow the 
same trajectory, and so we have: 


Lemma 4 Let $M = (Q, \Sigma, \delta, q_0, F)$ be a Mojmir automaton and let $w$ be a word.
Let $\tau, \tau' \in \mathbb{N}$ be two tokens that merge along the run of $M$ on $w$. Then either both
$\tau$ and $\tau'$ fail, or both succeed at the same rank, or both squat.


Proof By the definition of merge there is a time $t_0$ such that $\mathit{run}_w(\tau, t_0) = q \notin F$
and $\mathit{run}_w(\tau, t) = \mathit{run}_w(\tau', t)$ for all $t \geq t_0$. We proceed by case distinction and only
consider two cases.


- $\tau$ fails. This means that the token $\tau$ moves at some point to a non-accepting
sink and stays there forever. Let us call this time $t'$. Without loss of generality
we assume that the merge happens outside the sinks $S$ and we have $t' > t_0$.
Hence we have $\mathit{run}_w(\tau', t') = \mathit{run}_w(\tau, t') = q_s$ for a non-accepting sink $q_s$, and thus $\tau'$ also fails.

- $\tau$ succeeds at rank $i$. Thus the token $\tau$ moved at some time $t' > t_0$ from the
non-accepting states to the accepting states with rank $i$. Since $\tau$ and $\tau'$ already
merged and tokens that are in the same state have the same rank, also $\tau'$
succeeds with rank $i$.


We can now formulate and prove the main theorem of the section, presenting
conditions equivalent to “only finitely many tokens fail” (condition (1)), and “only
finitely many tokens squat in non-accepting states” (condition (2)):

Theorem 5 Let $M = (Q, \Sigma, \delta, q_0, F)$ be a Mojmir automaton and let $w$ be a word.
$M$ accepts $w$ if and only if the run of $M$ on $w$ satisfies the following two conditions:


(1) Finitely many tokens fail.
(2) There is a rank $i$ such that
(2.1) infinitely many tokens succeed at rank $i$, and
(2.2) finitely many pairs of tokens merge at rank older than $i$, i.e., with a rank
$j < i$.


Proof ($\Rightarrow$): Assume $M$ accepts $w$. Then almost every token of the run of $M$ on $w$
succeeds. Therefore, since no token can succeed and fail, (1) holds.

Let $i$ be the smallest rank satisfying (2.1) (since almost all tokens succeed and
the number of ranks is finite, such an $i$ exists). We prove that $i$ satisfies (2.2). Let
$\mathcal{M}_i$ be the set of pairs $(\tau, \tau')$ of tokens such that $\tau < \tau'$ and $\tau$ and $\tau'$ merge at rank
older than $i$. We prove that $\mathcal{M}_i$ is finite. By Lemma 4 either both $\tau$ and $\tau'$ succeed,
or none succeeds. Let $\mathcal{S}_i$ be the set of pairs $(\tau, \tau') \in \mathcal{M}_i$ such that both $\tau$ and $\tau'$
succeed. Since $M$ accepts $w$, almost every token succeeds, and so $\mathcal{M}_i \setminus \mathcal{S}_i$ is finite.

It remains to prove that $\mathcal{S}_i$ is finite. Since $i$ is the smallest rank at which infinitely
many tokens succeed, only finitely many tokens succeed at a rank older than $i$, and so it suffices to prove
that for every $(\tau, \tau') \in \mathcal{S}_i$ both $\tau$ and $\tau'$ succeed at a rank older than $i$. Let $t_0$ be
the time at which $\tau$ and $\tau'$ merge. By the definition of a merge, at time $t_0$ neither $\tau$
nor $\tau'$ has reached the set of accepting states. Since $\tau$ and $\tau'$ merge at rank older
than $i$ and two merged tokens always have the same rank, we have $\mathit{rk}_w(\tau', t_0) < i$.
Let $t_1 > t_0$ be the time at which both tokens enter the set of accepting states.
By Lemma 2(2), we have $\mathit{rk}_w(\tau, t_1) < i$ and $\mathit{rk}_w(\tau', t_1) < i$, and so both $\tau$ and $\tau'$
succeed at a rank older than $i$.

($\Leftarrow$): If $q_0 \in F$ then by the definition of Mojmir automata $M$ accepts every
word, and we are done. So assume $q_0 \notin F$.

By the definition of squatting, a token $\tau$ squats iff $\mathit{rk}_w(\tau, t) \in \mathbb{N}$ for every
$t \geq \tau$. By Lemma 2, the rank of $\tau$ can only get older, and so there is a time $t$ such
that $\mathit{rk}_w(\tau, t) = \mathit{rk}_w(\tau, t')$ for every $t' \geq t$. We call this rank the stable rank of $\tau$,
denoted by $\mathit{strk}_w(\tau)$. The following lemma, proved in the Appendix, shows that
all stable ranks are old.


Lemma 5 Let $i$ be the rank of condition (2). If the rank of $\tau$ stabilizes, then
$\mathit{strk}_w(\tau) < i$.


We now use the lemma to prove the result by contradiction. Assume $M$ does
not accept $w$. Then, infinitely many tokens do not succeed in the run of $M$ on
$w$. Since by (1) only finitely many tokens fail, infinitely many tokens squat in
non-accepting states. By Lemma 5, their stable ranks are all older than $i$. So there
is a rank $j < i$ such that infinitely many tokens have stable rank $j$. Let $\tau$ be one of
these tokens, and let $t$ be the time at which its rank stabilizes. All tokens born
after $t$ whose rank stabilizes at $j$ eventually merge with $\tau$. Therefore, infinitely many
pairs $(\tau, \tau')$ merge at rank $j$, which is older than $i$. But this contradicts our assumption that (2.2) holds.


We conclude the section with a definition that will be important in Section 6.

Definition 15 Let $M$ be a Mojmir automaton and let $w$ be a word. We say that
$M$ accepts $w$ at rank $i$ if $M$ accepts $w$ and $i$ is a rank satisfying condition (2) in Theorem 5.

Note that a word can be accepted at several ranks. In Section 6.2 we will show
that the ranks at which the automaton $M(\varphi)$ of a formula $\varphi$ accepts a word carry
useful information.


4.6 From Mojmir automata to deterministic Rabin automata 


From Theorem 5 we can easily derive a deterministic Rabin automaton equivalent to 
a given Mojmir automaton. More precisely, we show how to construct an automaton 
with a Rabin condition on transitions. Applying the construction of Section 2.3.1, 
this automaton can be transformed into one with a Rabin condition on states. 


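Definition 16 Let $M = (Q, \Sigma, q_0, \delta, F)$ be a Mojmir automaton with a set $S$
of sinks. The deterministic Rabin automaton $\mathcal{R}(M) = (Q_R, \Sigma, q_{0R}, \delta_R, \alpha_R)$ is
defined as follows:


- $Q_R$ is the set $SR$ of state-rankings of $M$;

- $q_{0R}$ is the state-ranking satisfying $q_{0R}(q_0) = 1$ and $q_{0R}(q) = \bot$ for every
$q \neq q_0$;

- $\delta_R(\mathit{sr}, \nu) = \mathit{nxt}(\mathit{sr}, \nu)$ for every state-ranking $\mathit{sr}$ and letter $\nu$;

- $\alpha_R = \bigvee_{i=1}^{|Q|} P_i$, where the $i$th Rabin pair is $P_i = (\mathit{fail} \cup \mathit{merge}(i), \mathit{succeed}(i))$,
and the sets $\mathit{fail}$, $\mathit{merge}(i)$, and $\mathit{succeed}(i)$ are defined as follows. A transition
$(\mathit{sr}, \nu, \mathit{sr}') \in \delta_R$ belongs to

  - $\mathit{fail}$ if there exists $q \in Q$ such that $\mathit{sr}(q) \in \mathbb{N}$ and $\delta(q, \nu) \in S \setminus F$.
  - $\mathit{succeed}(i)$ if there exists $q \notin F$ such that $\mathit{sr}(q) = i$ and $\delta(q, \nu) \in F$, or
    $q_0 \in F$ and $\mathit{sr}(q_0) = i$.⁵
  - $\mathit{merge}(i)$ if
    - there exists a state $q \in Q \setminus F$ and distinct states $q_1, q_2 \in Q$ such that
      $\delta(q_1, \nu) = q = \delta(q_2, \nu)$, $\mathit{sr}(q_1) < i$, and $\mathit{sr}(q_2) \neq \bot$; or
    - $q_0 \notin F$, and there exists a state $q$ such that $\delta(q, \nu) = q_0$ and $\mathit{sr}(q) < i$.⁶


$\mathcal{R}(M)$ accepts a word $w$ at rank $j$ if $P_j$ is an accepting pair on the run of $\mathcal{R}(M)$
on $w$.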


5 If $q_0$ is accepting then, by the definition of Mojmir automaton, all states reachable from $q_0$
are accepting. This condition covers the corner case in which no transition into an accepting
state is possible, because all states are accepting states.

6 In this case there is a merge between the token at $q$ and the token newly created on state
$q_0$.

Example 6 Let us determine the accepting pairs of the DRA on the right of Figure
4. We examine several representative cases.


- $t_1$ moves tokens from $q_1$ to the accepting sink $q_3$. Since $\mathit{sr}(q_1) = 1$, transition
$t_1$ belongs to $\mathit{succeed}(1)$. Since we can safely ignore sinks ($q_3$, $q_4$) and states
that are empty ($q_2$) for testing membership, we are done with $t_1$.

- $t_2$ takes tokens from the initial state and moves them to the non-accepting sink $q_4$.
This matches the definition of $\mathit{fail}$, with $\mathit{sr}(q_1) \in \mathbb{N}$ and $\delta(q_1, \bar{a}\bar{b}\bar{c}) = q_4 \in S \setminus F$,
where $\bar{a}\bar{b}\bar{c}$ is the letter containing none of $a$, $b$, $c$. Hence $t_2 \in \mathit{fail}$.

- $t_3$ moves tokens from $q_1$ to $q_2$. Since $q_2$ is neither a sink nor an accepting state,
$t_3$ is not contained in $\mathit{fail}$ or in any $\mathit{succeed}$ set. Moreover, since $\mathit{sr}(q_2) = \bot$, it
does not belong to any $\mathit{merge}$ set either.

- $t_8$ moves tokens from $q_1$ and $q_2$ to the non-accepting sink $q_4$. Hence $t_8 \in \mathit{fail}$.
Moreover, the transition merges the tokens from $q_1$ and $q_2$ in this sink at rank 1,
and so $t_8$ is also contained in $\mathit{merge}(2)$.


Altogether we obtain

\[ \begin{array}{lll} \mathit{fail} = \{t_2, t_7, t_8\} & \mathit{merge}(1) = \emptyset & \mathit{succeed}(1) = \{t_1, t_6\} \\ & \mathit{merge}(2) = \{t_5, t_8\} & \mathit{succeed}(2) = \{t_4, t_6, t_7\} \end{array} \]

It is easy to see that the runs accepted by the pair $P_1$ are those that take
$t_2, t_7, t_8$ only finitely often, and visit $(1, \bot)$ infinitely often. They are accepted at
rank 1. The runs accepted at rank 2 are those accepted by $P_2$ but not by $P_1$. They
take $t_1, t_2, t_5, t_6, t_7, t_8$ finitely often, and so they are exactly the runs with a $t_4^\omega$
suffix.
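The Rabin pairs of Definition 16 can be exercised on ultimately periodic words with the following Python sketch. It is an illustration under assumptions, not the construction as implemented by the authors: the state names q1 to q4 (for a ∨ (b U c), b U c, tt, ff), the encoding of letters as sets of propositions, and the helper names are all chosen here. A word is given as a finite prefix followed by a cycle that is repeated forever; because the automaton is deterministic and finite, the transitions taken infinitely often are found by iterating the cycle until a state-ranking repeats.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Letter = FrozenSet[str]
Ranking = Tuple[Optional[int], ...]           # ranks of STATES, in this order

STATES = ["q1", "q2", "q3", "q4"]             # assumed: a|(bUc), bUc, tt, ff
Q0, ACC, SINKS = "q1", {"q3"}, {"q3", "q4"}


def delta(q: str, v: Letter) -> str:
    if q == "q1":
        return "q3" if ("a" in v or "c" in v) else ("q2" if "b" in v else "q4")
    if q == "q2":
        return "q3" if "c" in v else ("q2" if "b" in v else "q4")
    return q                                  # sinks


def nxt(sr: Ranking, v: Letter) -> Ranking:
    low = {q: float("inf") for q in STATES}   # oldest rank arriving at each state
    for q, r in zip(STATES, sr):
        if r is not None:
            low[delta(q, v)] = min(low[delta(q, v)], r)
    low[Q0] = min(low[Q0], len(STATES))       # fresh token in the initial state
    return tuple(
        sum(1 for p in STATES if p not in SINKS and low[p] <= low[q])
        if q not in SINKS and low[q] != float("inf") else None
        for q in STATES)


def in_fail(sr: Ranking, v: Letter) -> bool:
    return any(r is not None and delta(q, v) in SINKS - ACC for q, r in zip(STATES, sr))


def in_succeed(i: int, sr: Ranking, v: Letter) -> bool:
    rank = dict(zip(STATES, sr))
    moved = any(q not in ACC and rank[q] == i and delta(q, v) in ACC for q in STATES)
    return moved or (Q0 in ACC and rank[Q0] == i)


def in_merge(i: int, sr: Ranking, v: Letter) -> bool:
    rank = dict(zip(STATES, sr))
    ranked = [q for q in STATES if rank[q] is not None]
    older = [q for q in ranked if rank[q] < i]
    clash = any(p != q and delta(p, v) == delta(q, v) and delta(p, v) not in ACC
                for p in older for q in ranked)
    with_fresh = Q0 not in ACC and any(delta(q, v) == Q0 for q in older)
    return clash or with_fresh


def accepting_ranks(prefix: List[Letter], cycle: List[Letter]) -> List[int]:
    """Ranks i whose pair P_i accepts the word prefix . cycle^omega."""
    sr: Ranking = tuple(1 if q == Q0 else None for q in STATES)
    for v in prefix:
        sr = nxt(sr, v)
    first_seen: Dict[Ranking, int] = {}
    rounds: List[List[Tuple[Ranking, Letter]]] = []
    while sr not in first_seen:               # finitely many state-rankings
        first_seen[sr] = len(rounds)
        steps = []
        for v in cycle:
            steps.append((sr, v))
            sr = nxt(sr, v)
        rounds.append(steps)
    # Transitions from the first repeated round onwards occur infinitely often.
    inf = [step for steps in rounds[first_seen[sr]:] for step in steps]
    return [i for i in range(1, len(STATES) + 1)
            if not any(in_fail(s, v) or in_merge(i, s, v) for s, v in inf)
            and any(in_succeed(i, s, v) for s, v in inf)]


print(accepting_ranks([], [frozenset("a")]))                 # a holds forever
print(accepting_ranks([frozenset("b")], [frozenset("ab")]))  # one old squatter
print(accepting_ranks([], [frozenset("b")]))                 # every token squats
```

Under these assumptions the first word is accepted at rank 1 only, the second at rank 2 only (token 0 squats in q2 with rank 1 while every later token succeeds with rank 2), and the third is rejected.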


Lemma 6 Let $M = (Q, \Sigma, q_0, \delta, F)$ be a Mojmir automaton, and let $\mathcal{R}(M)$ be its
corresponding Rabin automaton. For every word $w$, the sequence $\mathit{conf}_w(0)\,\mathit{conf}_w(1)\cdots$
is the run of $M$ on $w$ iff $\mathit{sr}_w(0)\,\mathit{sr}_w(1)\cdots$ is the run of $\mathcal{R}(M)$ on $w$.


The Rabin condition of this automaton checks conditions (1) and (2) of Theorem
5. Consider a transition $\mathit{conf}_w(t) \to \mathit{conf}_w(t+1)$ between two configurations
of $M$ in which some token moves into a non-accepting sink. Then the transition
$\mathit{sr}_w(t) \to \mathit{sr}_w(t+1)$ clearly belongs to the set $\mathit{fail}$, and vice versa. Similarly,
transitions of $\mathit{succeed}(i)$ correspond to transitions of $M$ that make some token
succeed at rank $i$, and transitions of $\mathit{merge}(i)$ correspond to transitions of $M$ that
merge two tokens at a rank older than $i$. So we obtain:


Theorem 6 Let $M$ be a Mojmir automaton, and let $\mathcal{R}(M)$ be its corresponding
Rabin automaton. Then $L(M) = L(\mathcal{R}(M))$. Moreover, for every $w \in L(M)$ both
$M$ and $\mathcal{R}(M)$ accept $w$ at the same ranks.


5 DRAs for Arbitrary FG-Formulae 


We show how to translate formulae of the form $\mathbf{FG}\varphi$ into DRAs. Thanks to
the results of Section 4, it suffices to translate them into Mojmir automata. We
show that the Mojmir automaton for a formula can be defined compositionally,
as an intersection of Mojmir automata. The next proposition shows that Mojmir
automata are closed under union and intersection (the proof can be found in the
Appendix).


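Proposition 5 Let $M_1 = (Q_1, \Sigma, q_{01}, \delta_1, F_1)$ and $M_2 = (Q_2, \Sigma, q_{02}, \delta_2, F_2)$. Let
$Q = Q_1 \times Q_2$, let $q_0 = (q_{01}, q_{02})$, and let $\delta: Q \times \Sigma \to Q$ be the function given by
$\delta((q_1, q_2), \nu) = (\delta_1(q_1, \nu), \delta_2(q_2, \nu))$. Then the tuples

\[ \begin{array}{l} M_1 \cap M_2 = (Q, \Sigma, q_0, \delta, F_1 \times F_2) \\ M_1 \cup M_2 = (Q, \Sigma, q_0, \delta, (F_1 \times Q_2) \cup (Q_1 \times F_2)) \end{array} \]

are also Mojmir automata, and moreover $L(M_1 \cap M_2) = L(M_1) \cap L(M_2)$ and
$L(M_1 \cup M_2) = L(M_1) \cup L(M_2)$.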
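The product construction of Proposition 5 is easy to write down explicitly. The following Python sketch builds both products and checks the side condition of Definition 8 (accepting states are closed under transitions). The `Mojmir` record, the function name `combine`, and the two tiny example automata are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, FrozenSet, Tuple


@dataclass
class Mojmir:
    states: FrozenSet
    alphabet: FrozenSet
    init: object
    delta: Dict[Tuple[object, object], object]    # (state, letter) -> state
    accepting: FrozenSet

    def is_well_formed(self) -> bool:
        """Accepting states must be closed under the transition function."""
        return all(self.delta[(q, v)] in self.accepting
                   for q in self.accepting for v in self.alphabet)


def combine(m1: Mojmir, m2: Mojmir, mode: str) -> Mojmir:
    """Product of two Mojmir automata; mode is 'intersection' or 'union'."""
    if m1.alphabet != m2.alphabet:
        raise ValueError("both automata must share the alphabet")
    states = frozenset(product(m1.states, m2.states))
    delta = {((q1, q2), v): (m1.delta[(q1, v)], m2.delta[(q2, v)])
             for (q1, q2) in states for v in m1.alphabet}
    if mode == "intersection":
        accepting = frozenset(product(m1.accepting, m2.accepting))
    elif mode == "union":
        accepting = (frozenset(product(m1.accepting, m2.states))
                     | frozenset(product(m1.states, m2.accepting)))
    else:
        raise ValueError(mode)
    return Mojmir(states, m1.alphabet, (m1.init, m2.init), delta, accepting)


# Two hypothetical two-state automata: m1 waits for an x, m2 waits for a y.
sigma = frozenset({"x", "y"})
m1 = Mojmir(frozenset({"p0", "p1"}), sigma, "p0",
            {("p0", "x"): "p1", ("p0", "y"): "p0", ("p1", "x"): "p1", ("p1", "y"): "p1"},
            frozenset({"p1"}))
m2 = Mojmir(frozenset({"r0", "r1"}), sigma, "r0",
            {("r0", "x"): "r0", ("r0", "y"): "r1", ("r1", "x"): "r1", ("r1", "y"): "r1"},
            frozenset({"r1"}))
both = combine(m1, m2, "intersection")
either = combine(m1, m2, "union")
print(both.is_well_formed(), either.is_well_formed())
```

Both products share states, initial state and transitions and differ only in the accepting set, which is the property the compositional construction relies on.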
Fig. 5: Mojmir automaton for words satisfying $\mathbf{FG}\psi_1$, but not $\mathbf{FG}\psi_2$.


5.1 A compositional construction: Intuition 


We present the intuition behind the construction by means of an example. Consider
the formula

\[ \varphi = \mathbf{FG}(\mathbf{F}a \vee (\mathbf{G}(a \vee \mathbf{F}b) \wedge c)). \]

We use the abbreviations $\psi_2 = a \vee \mathbf{F}b$ and $\psi_1 = \mathbf{F}a \vee (\mathbf{G}\psi_2 \wedge c)$, and so we also
refer to the formula as $\mathbf{FG}\psi_1$.

We cannot directly apply the construction of the last section because $\mathbf{FG}\psi_1$
contains the $\mathbf{G}$-subformula $\mathbf{G}\psi_2$. However, since $\psi_2$ does not contain any
$\mathbf{G}$-subformula, we can construct a Mojmir automaton $M(\psi_2)$ for $\mathbf{FG}\psi_2$. We use this
fact to define the automaton $M(\psi_1)$ as the union of two Mojmir automata: The
first automaton recognizes all words satisfying $\mathbf{FG}\psi_1$ but not $\mathbf{FG}\psi_2$ (and perhaps
some other words satisfying $\mathbf{FG}\psi_1$), while the second recognizes all words satisfying
$\mathbf{FG}\psi_1$ and $\mathbf{FG}\psi_2$ (and perhaps some other words satisfying $\mathbf{FG}\psi_1$). Consider for
example the words


wi = (abc abc)” w2 = (abc)” ws = (abc)? 


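We have $w_1 \models \mathbf{FG}\psi_1 \wedge \neg\mathbf{FG}\psi_2$, $w_2 \models \mathbf{FG}\psi_1 \wedge \mathbf{FG}\psi_2$ and $w_3 \not\models \mathbf{FG}\psi_1$. So both
automata will reject $w_3$. Moreover, the first automaton will accept $w_1$, and the
second $w_2$.

The first automaton, called $M(\psi_1, \emptyset)$ in Section 5.2 below, is just the Mojmir
automaton for the formula $\mathbf{FG}\psi_1[\mathbf{G}\psi_2/\mathbf{ff}]$, i.e., the result of substituting $\mathbf{G}\psi_2$ by $\mathbf{ff}$
in $\mathbf{FG}\psi_1$. It is easy to see that, since $\psi_1$ is in negation normal form, $\mathbf{FG}\psi_1[\mathbf{G}\psi_2/\mathbf{ff}]$
logically implies $\mathbf{FG}\psi_1$, and so every word accepted by $M(\psi_1, \emptyset)$ satisfies $\mathbf{FG}\psi_1$.
Moreover, observe that if a word $w$ does not satisfy $\mathbf{FG}\psi_2$, then the formula $\mathbf{G}\psi_2$
is false for every suffix $w_i$ of $w$, and so, intuitively, treating $\mathbf{G}\psi_2$ as false still
allows $M(\psi_1, \emptyset)$ to accept all words satisfying $\mathbf{FG}\psi_1$ but not $\mathbf{FG}\psi_2$. The automaton $M(\psi_1, \emptyset)$
that treats $\mathbf{G}\psi_2$ as $\mathbf{ff}$ is shown in Figure 5. To observe the effect of “treating $\mathbf{G}\psi_2$
as $\mathbf{ff}$”, consider state $\psi_1$ and the letter $\bar{a}\bar{b}c$ (only $c$ holds). If we used the function $\mathit{af}$ as transition
relation, then we would obtain the transition $\psi_1 \xrightarrow{\bar{a}\bar{b}c} \mathbf{F}a \vee (\mathbf{G}\psi_2 \wedge \mathbf{F}b)$. Instead,
since $\mathbf{G}\psi_2$ is treated as $\mathbf{ff}$, we get $\psi_1 \xrightarrow{\bar{a}\bar{b}c} \mathbf{F}a$.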

The second automaton is the intersection of two Mojmir automata. The first one
is $M(\psi_2)$, the Mojmir automaton for $\mathbf{FG}\psi_2$, which guarantees that the intersection only
accepts words satisfying $\mathbf{FG}\psi_2$. The second one, which will be called $M(\psi_1, \{\psi_2\})$
in Section 5.2, is intuitively in charge of checking that a word $w$ satisfies $\mathbf{FG}\psi_1$
assuming that it satisfies $\mathbf{FG}\psi_2$. Both automata are shown in Figure 6. We choose
$M(\psi_1, \{\psi_2\})$ as the Mojmir automaton for $\mathbf{FG}\psi_1[\mathbf{G}\psi_2/\mathbf{tt}]$. At first sight, since
$\mathbf{FG}\psi_2$ and $\mathbf{G}\psi_2$ are not equivalent, replacing $\mathbf{G}\psi_2$ by $\mathbf{tt}$ looks wrong. Let us see
why it is correct. Since $\mathbf{G}\psi_2$ eventually holds, the assumption that $\mathbf{G}\psi_2$ is true can
only be incorrect for a finite time, or, in other words, for a finite number of tokens.
Now we observe that the acceptance condition of Mojmir automata is insensitive
to the fate of a finite number of tokens: if almost every token eventually reaches
the accepting states, then after changing the fate of a finite number of tokens this
is still the case, and vice versa. So replacing $\mathbf{G}\psi_2$ by $\mathbf{tt}$ is correct after all.

Fig. 6: The automata $M(\psi_1, \{\psi_2\})$ and $M(\psi_2)$.

Consider state $\psi_1$ of $M(\psi_1, \{\psi_2\})$ and the letter $\bar{a}bc$ (only $b$ and $c$ hold). If we used the function $\mathit{af}$ as transition
relation, then we would obtain the transition $\psi_1 \xrightarrow{\bar{a}bc} \mathbf{F}a \vee \mathbf{G}\psi_2$. Since we handle
$\mathbf{G}\psi_2$ as $\mathbf{tt}$, we get $\psi_1 \xrightarrow{\bar{a}bc} \mathbf{tt}$ instead.

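We have thus constructed an automaton for $\mathbf{FG}(\mathbf{F}a \vee (\mathbf{G}(a \vee \mathbf{F}b) \wedge c))$. To handle
formulae $\mathbf{FG}\varphi$ where $\varphi$ has multiple $\mathbf{G}$-subformulae $\mathbf{G}\psi_1, \ldots, \mathbf{G}\psi_n$, possibly nested
within each other, we generalize the procedure above, and construct an automaton
$M(\varphi, \mathcal{G})$ for each subset $\mathcal{G}$ of $\mathbf{G}$-subformulae. The automaton $M(\varphi, \mathcal{G})$ accepts all
words $w$ such that $w \models \mathbf{FG}\varphi$ and $w \models \mathbf{FG}\psi$ for every $\mathbf{G}\psi \in \mathcal{G}$. The automaton is
an intersection of automata, one for each formula in $\mathcal{G}$. The automaton for $\mathbf{G}\psi_i$
handles the $\mathbf{G}$-subformulae of $\psi_i$ that belong to $\mathcal{G}$ as $\mathbf{tt}$. Observe that circular
assumptions of the form “the automaton for $\mathbf{G}\psi_1$ assumes that $\mathbf{FG}\psi_2$ holds, and
the automaton for $\mathbf{G}\psi_2$ assumes that $\mathbf{FG}\psi_1$ holds” are not possible because
no two formulae can be subformulae of each other.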

The final point is to address the state-explosion problem. In the construction
above, the final Mojmir automaton for a formula with $\mathbf{G}$-subformulae $\mathbf{G}\psi_1, \ldots, \mathbf{G}\psi_n$
is the union of $2^n$ Mojmir automata, and has an unacceptably large number of
states. Fortunately, we can construct all these automata so that they have exactly
the same states and transitions, and only differ on their set of accepting states.
The idea is to construct $M(\varphi, \mathcal{G})$ using a different transition function. We replace
$\mathit{af}$ by another function $\mathit{af}_{\mathbf{G}}$ that behaves like $\mathit{af}$, except for $\mathbf{G}$-subformulae, where
we set $\mathit{af}_{\mathbf{G}}(\mathbf{G}\psi, \nu) = \mathbf{G}\psi$ instead of $\mathit{af}(\mathbf{G}\psi, \nu) = \mathbf{G}\psi \wedge \mathit{af}(\psi, \nu)$. Intuitively, we
leave the decision whether to handle $\mathbf{G}\psi$ as $\mathbf{tt}$ or $\mathbf{ff}$ “open”. Then, for every set $\mathcal{G}$
we choose the accepting states appropriately: Since $M(\varphi, \mathcal{G})$ assumes that all the
formulae of $\mathcal{G}$ are true, we choose as accepting states those whose corresponding
formulae are propositionally implied by $\mathcal{G}$.

In our example, both $M(\psi_1, \emptyset)$ and $M(\psi_1, \{\psi_2\})$ are the intersection of the
two automata of Figure 7; they differ only in the accepting states. In the case
of $M(\psi_1, \emptyset)$, the left automaton treats $\mathbf{G}\psi_2$ as $\mathbf{ff}$, and the right automaton is
redundant; therefore, the only accepting state of the left automaton is $\mathbf{tt}$, and all
states of the right automaton are accepting. In the case of $M(\psi_1, \{\psi_2\})$, the
left automaton treats $\mathbf{G}\psi_2$ as $\mathbf{tt}$, and the right automaton checks that $\mathbf{G}\psi_2$
holds; therefore, the accepting states of the left automaton are $\mathbf{F}a \vee \mathbf{G}\psi_2$ and $\mathbf{tt}$,
and the only accepting state of the right automaton is $\mathbf{tt}$.

Fig. 7: Intersections with the same structure equivalent to $M(\psi_1, \emptyset)$ and $M(\psi_1, \{\psi_2\}) \cap M(\psi_2)$.


5.2 Logical Characterization 
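In order to formalize the notion of “handling a subformula $\mathbf{G}\psi$ as $\mathbf{tt}$” we introduce
the following definition:


Definition 17 Let $\varphi$ be a formula and $\nu \in 2^{Ap}$. The formula $\mathit{af}_{\mathbf{G}}(\varphi, \nu)$ is inductively
defined as $\mathit{af}(\varphi, \nu)$, with only this difference:

\[ \mathit{af}_{\mathbf{G}}(\mathbf{G}\varphi, \nu) = \mathbf{G}\varphi \qquad \text{(instead of } \mathit{af}(\mathbf{G}\varphi, \nu) = \mathit{af}(\varphi, \nu) \wedge \mathbf{G}\varphi\text{).} \]

We define $\mathit{Reach}_{\mathbf{G}}(\varphi) = \{[\mathit{af}_{\mathbf{G}}(\varphi, w)]_P \mid w \in (2^{Ap})^*\}$.

Example 7 Let $\varphi = \psi\,\mathbf{U}\,\neg a$, where $\psi = \mathbf{G}(a \wedge \mathbf{X}\neg a)$. We have

\[ \begin{array}{lll} \mathit{af}_{\mathbf{G}}(\varphi, \{a\}) & = \mathit{af}_{\mathbf{G}}(\psi, \{a\}) \wedge \varphi & \equiv_P \psi \wedge \varphi \\ \mathit{af}(\varphi, \{a\}) & = \mathit{af}(\psi, \{a\}) \wedge \varphi & \equiv_P \neg a \wedge \psi \wedge \varphi \end{array} \]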
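The difference between $\mathit{af}$ and $\mathit{af}_{\mathbf{G}}$ can be made concrete with a small Python sketch over formulae encoded as nested tuples. Several things here are assumptions of the illustration rather than part of the text: the unfolding rules for the operators other than $\mathbf{G}$ are the usual ones (the definition of $\mathit{af}$ lies outside this section), the simplification only removes $\mathbf{tt}$ and $\mathbf{ff}$ and does not decide propositional equivalence, and the formula of Example 7 is read as $\psi\,\mathbf{U}\,\neg a$ with $\psi = \mathbf{G}(a \wedge \mathbf{X}\neg a)$, as the caption of Fig. 8 suggests.

```python
from typing import FrozenSet, Tuple

Formula = Tuple  # ("ap", "a"), ("not", "a"), ("and", f, g), ("or", f, g), ("X", f), ...
TT, FF = ("tt",), ("ff",)


def conj(f: Formula, g: Formula) -> Formula:
    if FF in (f, g):
        return FF
    if f == TT:
        return g
    return f if g == TT else ("and", f, g)


def disj(f: Formula, g: Formula) -> Formula:
    if TT in (f, g):
        return TT
    if f == FF:
        return g
    return f if g == FF else ("or", f, g)


def af(f: Formula, v: FrozenSet[str], freeze_g: bool = False) -> Formula:
    """One unfolding step under letter v; freeze_g=True gives the variant af_G."""
    op = f[0]
    if op in ("tt", "ff"):
        return f
    if op == "ap":
        return TT if f[1] in v else FF
    if op == "not":                       # negated atomic proposition
        return FF if f[1] in v else TT
    if op == "and":
        return conj(af(f[1], v, freeze_g), af(f[2], v, freeze_g))
    if op == "or":
        return disj(af(f[1], v, freeze_g), af(f[2], v, freeze_g))
    if op == "X":
        return f[1]
    if op == "F":
        return disj(af(f[1], v, freeze_g), f)
    if op == "G":                         # the only rule where af and af_G differ
        return f if freeze_g else conj(af(f[1], v, freeze_g), f)
    if op == "U":
        return disj(af(f[2], v, freeze_g), conj(af(f[1], v, freeze_g), f))
    raise ValueError(op)


# Example from Section 5.1: psi1 = Fa | (G psi2 & c) with psi2 = a | Fb, letter {c}.
Fa, Fb = ("F", ("ap", "a")), ("F", ("ap", "b"))
G_psi2 = ("G", ("or", ("ap", "a"), Fb))
psi1 = ("or", Fa, ("and", G_psi2, ("ap", "c")))
print(af(psi1, frozenset("c")))
print(af(psi1, frozenset("c"), freeze_g=True))

# Example 7 (as read above): phi = psi U !a with psi = G(a & X !a), letter {a}.
psi = ("G", ("and", ("ap", "a"), ("X", ("not", "a"))))
phi = ("U", psi, ("not", "a"))
print(af(phi, frozenset("a")))
print(af(phi, frozenset("a"), freeze_g=True))
```

With `freeze_g=True` the subformula $\mathbf{G}\psi_2$ is carried along unchanged, so the decision whether to read it as $\mathbf{tt}$ or $\mathbf{ff}$ is postponed to the choice of accepting states.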


The logical characterization theorem will be an easy corollary of Lemma 7
below. Given a formula $\varphi$ and a word $w$, the lemma characterizes the set of
$\mathbf{G}$-subformulae of $\varphi$ that eventually hold at a word $w$, i.e., the subformulae $\mathbf{G}\psi$ such
that $w \models \mathbf{FG}\psi$. If $\varphi$ is of the form $\mathbf{FG}\psi$, then clearly $w \models \varphi$ iff the subformula
$\mathbf{G}\psi$ belongs to this set.

Definition 18 Given a formula $\varphi$, we denote by $\mathbb{G}(\varphi)$ the set of $\mathbf{G}$-subformulae
of $\varphi$, i.e., the subformulae of $\varphi$ of the form $\mathbf{G}\psi$. Given a word $w$, we say that
$\mathbf{G}\psi \in \mathbb{G}(\varphi)$ is eventually true in $w$ if $w \models \mathbf{FG}\psi$. We denote the set of eventually
true $\mathbf{G}$-subformulae of $\varphi$ by $\mathcal{G}_w(\varphi)$.


Definition 19 A set $\mathcal{G} \subseteq \mathbb{G}(\varphi)$ is closed for $w$ if $\mathcal{G} \models_P \mathit{af}_{\mathbf{G}}(\psi, w_{ij})$ holds for
almost all $i \in \mathbb{N}$, almost all $j \geq i$, and for every $\mathbf{G}\psi \in \mathcal{G}$.


The following lemma shows that eventually true G-subformulae can be charac- 
terized using the closed sets. 


Lemma 7 Let φ be a formula and let w be a word.
1. Every set 𝒢 ⊆ 𝔾(φ) closed for w is included in 𝒢_w(φ).
2. 𝒢_w(φ) is closed for w.


Theorem 7 (Logical characterization theorem III) For every LTL formula
FGψ and every word w: w ⊨ FGψ iff there exists a closed set 𝒢 ⊆ 𝔾(FGψ) for
w containing Gψ.


Proof (⇒): Assume w ⊨ FGψ. Then Gψ ∈ 𝒢_w(FGψ) and by Lemma 7(2) 𝒢_w(FGψ)
is closed for w. So we can take 𝒢 = 𝒢_w(FGψ).

(⇐): Assume some 𝒢 ⊆ 𝔾(FGψ) containing Gψ is closed for w. By Lemma 7(1)
we have Gψ ∈ 𝒢_w(FGψ), and so, by the definition of 𝒢_w(FGψ), we get w ⊨ FGψ.


Let us see that the theorem indeed generalizes Theorem 1. If ψ is a G-free
formula, then 𝔾(FGψ) = {Gψ}. So the only possible choice for 𝒢 is 𝒢 = {Gψ},
and the only formula to be checked in Definition 19 is ψ itself. Further, we have


     𝒢 ⊨_P af_G(ψ, w_ij)
iff  Gψ ⊨_P af_G(ψ, w_ij)
iff  ∅ ⊨_P af_G(ψ, w_ij)      (Gψ does not occur in af_G(ψ, w_ij))
iff  af_G(ψ, w_ij) ≡_P tt
iff  af(ψ, w_ij) ≡_P tt       (af(ψ, w_ij) = af_G(ψ, w_ij) since ψ is G-free)

So for a G-free formula ψ the theorem states that w ⊨ FGψ iff af(ψ, w_ij) ≡_P tt
for almost every i ∈ ℕ and almost every j ≥ i.


Let us construct a Mojmir automaton for FGψ from Theorem 7. The key is
the following simple fact:


af_G(ψ, w_ij) ≡_P tt holds for almost every i ∈ ℕ and almost every j ≥ i     (*)
iff
for almost every i ∈ ℕ there exists j ≥ i such that af_G(ψ, w_ij) ≡_P tt     (**)


For the proof, notice first that (*) implies (**); for the other direction recall that if
af_G(ψ, w_ij) ≡_P tt then af_G(ψ, w_ij') ≡_P tt for every j' ≥ j.

Now, we observe that (**) has the form of the acceptance condition of a Mojmir
automaton. Intuitively, we can reshape it into “for every token i ∈ ℕ there exists a
time j ∈ ℕ such that af_G(ψ, w_ij) ≡_P tt”. So we define:


Definition 20 Let φ be a formula and let 𝒢 ⊆ 𝔾(φ). The Mojmir automaton of
φ with respect to 𝒢 is M(φ, 𝒢) = (Reach_G(φ), φ, af_G, F_𝒢), where F_𝒢 is the set of
formulae φ' ∈ Reach_G(φ) such that 𝒢 ⊨_P φ'.

Fig. 8: Transition systems of the Mojmir automata for φ = (Gψ)U¬a and for ψ = a ∧ X¬a.


As we announced earlier, only the set of accepting states of M(φ, 𝒢) depends on
𝒢. The following lemma, proved in the Appendix, shows that M(φ, 𝒢) is indeed
a Mojmir automaton, i.e., that states reachable from accepting states are also
accepting.


Lemma 8 Let φ be a formula and let 𝒢 ⊆ 𝔾(φ). For every φ' ∈ Reach_G(φ) and
every ν ∈ 2^Ap, if 𝒢 ⊨_P φ' then 𝒢 ⊨_P af_G(φ', ν).


Example 8 Let φ = (Gψ)U¬a, where ψ = a ∧ X¬a. We have 𝔾(φ) = {Gψ}, and
so two automata M(φ, ∅) and M(φ, {Gψ}), whose common transition system is
shown in Figure 8. We have one single automaton M(ψ, ∅), shown on the right of
the figure. A formula ψ' is an accepting state of M(ψ, ∅) if tt ⊨_P ψ'; and so the
only accepting state of this automaton is tt. The same holds for M(φ, ∅). On the
other hand, φ' is an accepting state of M(φ, {Gψ}) if Gψ ⊨_P φ', and so both Gψ
and tt are accepting states.
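
The accepting states of Definition 20 can be computed mechanically. The sketch below is
an illustration under stated assumptions: formulas are nested tuples, every subformula
whose top symbol is not ∧ or ∨ is treated as a propositional atom (with a negated
proposition read as the negation of its atom), and 𝒢 ⊨_P φ' is decided by brute-force
enumeration of truth assignments. The names atoms, holds, entails and accepting_states
are hypothetical, and the list of candidate states for Example 8 was worked out by hand
for this sketch rather than read off Figure 8.

```python
from itertools import product

TT = ("tt",)


def atoms(phi):
    """Propositional atoms of phi: maximal subformulae not headed by and/or."""
    op = phi[0]
    if op in ("and", "or"):
        return atoms(phi[1]) | atoms(phi[2])
    if op in ("tt", "ff"):
        return set()
    if op == "nap":  # negated proposition: its atom is the proposition
        return {("ap", phi[1])}
    return {phi}


def holds(phi, true_atoms):
    """Truth value of phi under the assignment making exactly true_atoms true."""
    op = phi[0]
    if op == "tt":
        return True
    if op == "ff":
        return False
    if op == "and":
        return holds(phi[1], true_atoms) and holds(phi[2], true_atoms)
    if op == "or":
        return holds(phi[1], true_atoms) or holds(phi[2], true_atoms)
    if op == "nap":
        return ("ap", phi[1]) not in true_atoms
    return phi in true_atoms


def entails(gset, phi):
    """Decide gset |=_P phi, where gset is a set of G-formulae (atoms)."""
    free = list(atoms(phi) - set(gset))
    for bits in product([False, True], repeat=len(free)):
        true_atoms = set(gset) | {a for a, b in zip(free, bits) if b}
        if not holds(phi, true_atoms):
            return False
    return True


def accepting_states(states, gset):
    """States phi' with gset |=_P phi' (the set F_G of Definition 20)."""
    return [s for s in states if entails(gset, s)]
```

With G_PSI standing for Gψ and PHI for (Gψ)U¬a, calling accepting_states on the
candidate states PHI, tt, Gψ ∧ PHI and Gψ returns only tt for the empty set and both tt
and Gψ for {Gψ}, in line with Example 8.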


As a corollary of Lemma 7 and Definition 20 we obtain: 


Corollary 1 Let φ be a formula, w a word, and 𝒢 ⊆ 𝔾(φ).


- If for every Gψ ∈ 𝒢 we have w ∈ L(M(ψ, 𝒢)), then for every Gψ ∈ 𝒢 we have
  w ⊨ FGψ.

- If for every Gψ ∈ 𝒢 we have w ⊨ FGψ, then for every Gψ ∈ 𝒢_w(φ) we have
  w ∈ L(M(ψ, 𝒢_w(φ))).


Moreover, as a particular case:


Theorem 8 Let FGψ be a formula and let w be a word. Then w ⊨ FGψ iff there
is 𝒢 ⊆ 𝔾(FGψ) containing Gψ such that w ∈ L(M(ψ', 𝒢)) for every Gψ' ∈ 𝒢.


5.3 The Product Automaton 


Theorem 8 allows us to construct a generalized Rabin automaton for an arbitrary
FG-formula FGψ.


Definition 21 Let φ = FGψ be an FG-formula, and let 𝔾(φ) be the set of
G-subformulae of φ. For every formula Gψ' ∈ 𝔾(φ), let
R(ψ', 𝒢) = (Q_ψ', q_0ψ', δ_ψ', Acc^𝒢_ψ') be the Rabin automaton obtained by applying
Definition 16 to the Mojmir automaton M(ψ', 𝒢). (Recall that Q_ψ', q_0ψ', and δ_ψ'
do not depend on 𝒢.)

We define the generalized Rabin automaton R(φ) as

R(φ) = ( ∏_{Gψ' ∈ 𝔾(φ)} Q_ψ', 2^Ap, ∏_{Gψ' ∈ 𝔾(φ)} q_0ψ', ∏_{Gψ' ∈ 𝔾(φ)} δ_ψ', Acc )

where the accepting condition Acc, which expresses “some 𝒢 ⊆ 𝔾(φ) containing
Gψ is closed”, is given by

Acc := ⋁_{𝒢 ⊆ 𝔾(φ), Gψ ∈ 𝒢} ⋀_{Gψ' ∈ 𝒢} Acc^𝒢_ψ'

Since each Acc^𝒢_ψ' is a Rabin condition, Acc is a generalized Rabin condition. R(φ)
can be transformed into an equivalent Rabin automaton using the construction of
Section 2.3.1. Notice however that, as shown in [CGK13], for many applications it
is better to keep the generalized Rabin condition.


Theorem 9 Let φ be an FG-formula and let w be a word. Then w ⊨ φ iff
w ∈ L(R(φ)).


Proof Assume φ = FGψ. By the definition of its accepting condition, R(φ) accepts
a word w iff there is a set 𝒢 ⊆ 𝔾(φ) containing Gψ such that R(ψ', 𝒢) accepts w
for every Gψ' ∈ 𝒢. By Theorem 6, this is the case iff M(ψ', 𝒢) accepts w for every
Gψ' ∈ 𝒢. By Theorem 8 this is the case iff w ⊨ φ.


6 DRAs for Arbitrary Formulae 


In order to explain the last step of our procedure, let Ap = {a, b, c} be a set
of atomic propositions, and consider the formula φ = b ∨ XGψ over Ap, where
ψ = a ∨ X(bUc). Following the ideas of the previous section, we try to construct
an automaton for φ as the union of


(i) an automaton M(φ, ∅) accepting all words satisfying φ but not FGψ (plus
possibly other words satisfying φ), and

(ii) an automaton M(φ, {Gψ}) accepting all words satisfying φ and FGψ (plus
possibly other words satisfying φ).


By the same argument we gave in the previous section, for M(φ, ∅) we can take a
Mojmir automaton accepting the words satisfying φ[Gψ/ff] = b ∨ Xff ≡ b. We now
try to construct M(φ, {Gψ}) as the intersection of two Mojmir automata: M(ψ),
which guarantees that the intersection only accepts words satisfying FGψ, and an
automaton that accepts the words satisfying φ under the assumption that they
satisfy FGψ. The automaton M(ψ) is shown on the right of Figure 9. But what
can the other automaton be?

We consider the following idea. As transition system of the automaton we take
𝒯(φ) (see Definition 7). This guarantees that the state reached after reading a
finite word w_0i is af(φ, w_0i). Further, we choose a co-Büchi accepting condition
stating that states φ' ∈ Reach(φ) that do not satisfy Gψ ⊨_P φ' occur only finitely
often in the run. Then, an accepting run on a word w gets eventually trapped in
states satisfying Gψ ⊨_P φ'. So, since Gψ eventually holds, for a sufficiently large
i we have w_i ⊨ af(φ, w_0i), and so by Proposition 2 we have w ⊨ φ.

Fig. 9: Automata B(φ) and M(ψ) for φ = b ∨ XGψ and ψ = a ∨ X(bUc).


Unfortunately, while this reasoning is sound, it is not complete. In our example,
this idea leads to the automaton B(φ) shown on the left of Figure 9. Since we have
Gψ ⊨_P tt and Gψ ⊨_P Gψ, the accepting states of B(φ) are q_1 and q_2. Consider
the word w = abc(abc)^ω. We have w ⊨ φ, but the run for w starts at q_0, moves to
q_2, and then moves to q_3 and stays there forever. So w is rejected. The point is
that neither 𝒢 = ∅ nor 𝒢 = {Gψ} satisfy 𝒢 ⊨_P q_3.

In the rest of the section we show that this is, however, nearly correct. We 
construct a correct automaton with the same states and transitions as the one 
above, but with a modified accepting condition. For this we first interpret this 
failed attempt in logical terms. 


6.1 Logical characterization theorem 


Our failed attempt amounts to, given a word w, checking if there is a closed set 𝒢
for w satisfying 𝒢 ⊨_P af(φ, w_0j) for almost every j ∈ ℕ. The following proposition
summarizes our observation that this condition does not characterize the words
satisfying φ.


Proposition 6 Let φ be a formula and w a word. If there exists a set 𝒢 ⊆ 𝔾(φ)
such that (1) 𝒢 is closed for w and (2) 𝒢 ⊨_P af(φ, w_0j) for almost every j ∈ ℕ,
then w ⊨ φ. However, the converse does not hold.


Proof Assume such a 𝒢 exists. Since 𝒢 is closed for w, by Lemma 7(1) we have
w ⊨ FGψ for every Gψ ∈ 𝒢, and so there exists an index i ∈ ℕ such that w_j ⊨ 𝒢
for every j ≥ i. By (2), we have 𝒢 ⊨_P af(φ, w_0j) for some j ≥ i and hence
w_j ⊨ af(φ, w_0j). Finally, by Proposition 2, w ⊨ φ.

The converse does not hold due to the previous example, where neither 𝒢 = ∅
nor 𝒢 = {Gψ} satisfy 𝒢 ⊨_P af(φ, w_0j).


In the rest of the section we weaken condition (2) of Proposition 6 so that the
converse also holds, thus yielding a logical characterization theorem that generalizes
Theorem 7. More precisely, our goal is to find an adequate formula ℱ(𝒢, w_0j) such
that after replacing condition (2) by

(2*) 𝒢 ∧ ℱ(𝒢, w_0j) ⊨_P af(φ, w_0j) for almost every j ∈ ℕ


both Proposition 6 and its converse hold. Observe that we replace 𝒢 by the stronger
formula 𝒢 ∧ ℱ(𝒢, w_0j), which makes the propositional implication easier to satisfy.


6.1.1 A first candidate for ℱ(𝒢, w_0j)


The formula ℱ(𝒢, w_0j) should satisfy w_j ⊨ ℱ(𝒢, w_0j) for almost every j ∈ ℕ,
because then we can still prove that (1) and (2*) imply w ⊨ φ using the same
proof as in Proposition 6. So we search for a formula satisfying this condition.

Let us examine the closure condition in more detail. Given Gψ ∈ 𝒢, it states
that for almost all i ∈ ℕ we have 𝒢 ⊨_P af_G(ψ, w_ij) for almost all j ≥ i. So there
is a smallest index i such that 𝒢 ⊨_P af_G(ψ, w_ij) holds for almost every j ≥ i. We
give it a name, and define a first candidate for ℱ(𝒢, w_0j).


Definition 22 Let φ be a formula and let w be a word. Let 𝒢 ⊆ 𝔾(φ) be closed
for w and let Gψ ∈ 𝒢. The threshold thr_w(ψ, 𝒢) of ψ in 𝒢 is the smallest index i
such that 𝒢 ⊨_P af_G(ψ, w_jk) holds for every j ≥ i and almost all k ≥ j. Further,
we define

ℱ_1(ψ, 𝒢, w_0j) = ⋀_{i = thr_w(ψ, 𝒢)}^{j} af_G(ψ, w_ij)

ℱ_1(𝒢, w_0j) = ⋀_{Gψ ∈ 𝒢} ℱ_1(ψ, 𝒢, w_0j)

Recall that w_ij = ε if i ≥ j by definition. Since af_G(ψ, ε) = ψ, we can also define

ℱ_1(ψ, 𝒢, w_0j) = ψ                                              if j = 0
ℱ_1(ψ, 𝒢, w_0j) = ψ ∧ ⋀_{i = thr_w(ψ, 𝒢)}^{j−1} af_G(ψ, w_ij)    if j > 0


Example 9 Consider the formula φ = b ∨ XGψ with ψ = a ∨ X(bUc) and let
𝒢 = {Gψ}. For w = (abc)^ω we have af_G(ψ, w_ij) = tt for every 0 ≤ i < j. So 𝒢 is
closed for w. Further we have thr_w(ψ, 𝒢) = 0, and so ℱ_1(ψ, 𝒢, w_0j) = ψ for every
j ≥ 0.

For w = (abc)^ω we have af_G(ψ, w_i(i+1)) = bUc for every i ≥ 0, and af_G(ψ, w_ij) =
tt for every j > i+1 ≥ 1. So 𝒢 is closed for w. Further we have thr_w(ψ, 𝒢) = 0,
and so

ℱ_1(ψ, 𝒢, w_0j) = ψ            if j = 0
ℱ_1(ψ, 𝒢, w_0j) = ψ ∧ (bUc)    if j > 0

For w = abc(abc)^ω we have af_G(ψ, w_0j) = bUc for all j > 0 and af_G(ψ, w_ij) =
tt for all other pairs j > i. So 𝒢 is closed for w. Further we have thr_w(ψ, 𝒢) = 1,
because Gψ ⊭_P af_G(ψ, w_0j) = bUc for all j > 0. So ℱ_1(ψ, 𝒢, w_0j) = ψ for every
j ≥ 0.


Let us prove that our first candidate indeed satisfies w_j ⊨ 𝒢 ∧ ℱ_1(𝒢, w_0j) for
almost every j.


Lemma 9 Let φ, w, 𝒢 and Gψ be as in Definition 22. Then w_j ⊨ 𝒢 ∧ ℱ_1(𝒢, w_0j)
for almost every j ∈ ℕ.

Proof By the semantics of LTL, there exists an index k such that for every Gψ ∈
𝔾(φ) either w_k ⊨ Gψ or w_k ⊭ FGψ holds. We say that 𝔾(φ) stabilizes at k. By
Theorem 7, we further have w_k ⊨ 𝒢. So w_j ⊨ 𝒢 for every j ≥ k. We now show that
w_j ⊨ af_G(ψ, w_ij) holds for every j ≥ k, every Gψ ∈ 𝒢, and every i ≥ thr_w(ψ, 𝒢),
which concludes the proof. We consider two cases. If 𝒢 ⊨_P af_G(ψ, w_ij) holds, then
the claim follows from w_j ⊨ 𝒢. If 𝒢 ⊭_P af_G(ψ, w_ij) then, since i ≥ thr_w(ψ, 𝒢),
there exists j' > j such that 𝒢 ⊨_P af_G(ψ, w_ij'). Since j' > k, we have w_j' ⊨ 𝒢,
and so w_j' ⊨ af_G(ψ, w_ij') = af_G(af_G(ψ, w_ij), w_jj'). It remains to show that
w_j' ⊨ af_G(af_G(ψ, w_ij), w_jj') implies w_j ⊨ af_G(ψ, w_ij). The proof is by
structural induction on ψ. All cases are identical to those of Proposition 2,
with the exception of ψ = Gψ'. If ψ = Gψ' we have af_G(af_G(ψ, w_ij), w_jj') =
af_G(ψ, w_ij) = Gψ', and so we have to prove that w_j' ⊨ Gψ' implies w_j ⊨ Gψ'.
Since j' > j, this does not seem at first to be the case, but recall that we have
j' > j ≥ k by hypothesis; since 𝔾(φ) stabilizes at k, the two suffixes w_j' and w_j
satisfy the same formulae of 𝔾(φ), and we are done.


Unfortunately, our first candidate is not good enough for a logical
characterization: we can find a formula φ and a word w such that w ⊨ φ but no set 𝒢
satisfies conditions (1) and (2*).


Example 10 Let φ = Gψ, where ψ = Xa ∨ Gb, and w = a^ω. We have w ⊨ φ. The
only non-empty set closed for w is 𝒢 = {φ}. However, for this 𝒢 condition (2*)
does not hold. Indeed, we have


af_G(ψ, w_ij) = a ∨ Gb    for every j = i+1
af_G(ψ, w_ij) = tt        for every j > i+1
af(φ, w_0j) = φ ∧ a       for every j ≥ 1


and so (2*) holds only if φ ∧ (a ∨ Gb) ⊨_P φ ∧ a, which is not the case.


6.1.2 A second (and correct) candidate 


Observe that, intuitively, if both (1) and (2*) hold, then w satisfies φ even if it does
not satisfy any of the formulae of \overline{𝒢} = 𝔾(φ) \ 𝒢. Using this, we show that
Lemma 9 still holds if we strengthen ℱ_1(𝒢, w_0i) by, loosely speaking, replacing
occurrences of formulae of \overline{𝒢} by ff. Let us define this formula ℱ(𝒢, w_0i),
our final candidate.


Definition 23 Let φ, w, 𝒢, and Gψ be as in Definition 22, and let
\overline{𝒢} = 𝔾(φ) \ 𝒢. We define


ℱ(ψ, 𝒢, w_0i) = ℱ_1(ψ, 𝒢, w_0i)[\overline{𝒢}/ff]_P

ℱ(𝒢, w_0i) = ⋀_{Gψ ∈ 𝒢} ℱ(ψ, 𝒢, w_0i)


Example 11 In Example 10 we have 𝒢 = {φ}, hence \overline{𝒢} = {Gb}. So
ℱ(ψ, 𝒢, w_0i) = (a ∨ Gb)[{Gb}/ff]_P = a, and now condition (2*) holds.

For the three words of Example 9 we have 𝒢 = 𝔾(φ), and so ℱ(ψ, 𝒢, w_0i) =
ℱ_1(ψ, 𝒢, w_0i).
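
The effect of the substitution in Examples 10 and 11 can be checked mechanically. The
following sketch is illustrative only and rests on assumptions: formulas are nested
tuples, subformulae not headed by ∧ or ∨ are propositional atoms, the substitution
acts only at the propositional level, and entailment is decided by enumerating truth
assignments. The names subst_ff and entails are hypothetical.

```python
from itertools import product

TT, FF = ("tt",), ("ff",)


def subst_ff(phi, gbar):
    """phi[gbar/ff]_P: replace propositional occurrences of gbar-formulae by ff."""
    if phi in gbar:
        return FF
    if phi[0] in ("and", "or"):
        l, r = subst_ff(phi[1], gbar), subst_ff(phi[2], gbar)
        # unit is the neutral constant, zero the absorbing one
        unit, zero = (TT, FF) if phi[0] == "and" else (FF, TT)
        if zero in (l, r):
            return zero
        if l == unit:
            return r
        if r == unit:
            return l
        return (phi[0], l, r)
    return phi


def atoms(phi):
    """Propositional atoms: maximal subformulae not headed by and/or."""
    if phi[0] in ("and", "or"):
        return atoms(phi[1]) | atoms(phi[2])
    if phi[0] in ("tt", "ff"):
        return set()
    if phi[0] == "nap":
        return {("ap", phi[1])}
    return {phi}


def holds(phi, true_atoms):
    """Truth value of phi when exactly the atoms in true_atoms are true."""
    op = phi[0]
    if op == "tt":
        return True
    if op == "ff":
        return False
    if op == "and":
        return holds(phi[1], true_atoms) and holds(phi[2], true_atoms)
    if op == "or":
        return holds(phi[1], true_atoms) or holds(phi[2], true_atoms)
    if op == "nap":
        return ("ap", phi[1]) not in true_atoms
    return phi in true_atoms


def entails(lhs, rhs):
    """Decide lhs |=_P rhs by enumerating all assignments to the atoms."""
    props = list(atoms(lhs) | atoms(rhs))
    for bits in product([False, True], repeat=len(props)):
        true_atoms = {p for p, b in zip(props, bits) if b}
        if holds(lhs, true_atoms) and not holds(rhs, true_atoms):
            return False
    return True
```

On the formulas of Example 10, entails reports that φ ∧ (a ∨ Gb) does not
propositionally imply φ ∧ a, while after replacing Gb by ff the left-hand side becomes
φ ∧ a and the implication holds.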


Lemma 10 Let φ, w, 𝒢 and Gψ be as in Definition 22. Then w_j ⊨ 𝒢 ∧ ℱ(𝒢, w_0j)
for almost every j ∈ ℕ.

Proof The proof is analogous to the proof of Lemma 9 and additionally relies on
the following equivalence, which can be proven by a straightforward induction on ψ.


𝒢 ⊨_P af_G(ψ[\overline{𝒢}/ff]_P, w_0i)   iff   𝒢 ⊨_P af_G(ψ, w_0i)[\overline{𝒢}/ff]_P


We show that the new candidate indeed yields a logical characterization theorem. 


Theorem 10 (Logical characterization theorem IV) Let φ be a formula and
w a word. Then w ⊨ φ iff there exists 𝒢 ⊆ 𝔾(φ) satisfying (1) 𝒢 is closed for w,
and (2*) 𝒢 ∧ ℱ(𝒢, w_0i) ⊨_P af(φ, w_0i) for almost every i ∈ ℕ.


Proof (⇐) By (1) and Lemma 10 we have w_j ⊨ 𝒢 ∧ ℱ(𝒢, w_0j), and by (2*) we have
𝒢 ∧ ℱ(𝒢, w_0j) ⊨_P af(φ, w_0j), for almost every j ∈ ℕ. This implies
w_j ⊨ af(φ, w_0j) for almost every j ∈ ℕ, and therefore w ⊨ φ by Proposition 2.

(⇒) Assume w ⊨ φ. Let 𝒢_w be the set of all formulae Gψ ∈ 𝔾(φ) such that
w ⊨ FGψ. Then by Lemma 7, 𝒢_w satisfies (1). For (2*), we first consider the
special case in which thr_w(ψ, 𝒢_w) = 0 holds for all Gψ ∈ 𝒢_w, that is, we not only
have w ⊨ FGψ but even w ⊨ Gψ for every Gψ ∈ 𝒢_w. Then, by the same reasoning
as in the proof of Theorem 7, we obtain that 𝒢_w ⊨_P af_G(φ, w_0j) holds for almost
all j ∈ ℕ. So, after unfolding the definition of ℱ(𝒢_w, w_0j), it remains to show that
for almost all j ∈ ℕ:


af_G(φ, w_0j)[\overline{𝒢_w}/ff]_P ∧ ⋀_{Gψ ∈ 𝒢_w} ( Gψ ∧ ⋀_{i=0}^{j} af_G(ψ, w_ij)[\overline{𝒢_w}/ff]_P ) ⊨_P af(φ, w_0j)

which is proven by a straightforward induction on φ. We consider only two sample
cases:


- φ = a. Since φ = a does not have any G-subformulae, the conjunction over all
  Gψ on the left-hand side is simply tt and also the propositional substitution
  has no effect. After simplification we obtain af_G(a, w_0j) ⊨_P af(a, w_0j), which
  is true.

- φ = Gφ'. In the case Gφ' ∉ 𝒢_w, the left-hand side is propositionally equal to
  ff and hence the claim holds. Thus assume Gφ' ∈ 𝒢_w. Let us now examine the
  right-hand side:

  af(Gφ', w_0j) = Gφ' ∧ ⋀_{i=0}^{j} af(φ', w_ij)

  Since Gφ' ∈ 𝒢_w, the first conjunct is implied by the left-hand side. Let now
  af(φ', w_ij) be an arbitrary conjunct of the right-hand side. Then there is
  a matching af_G(φ', w_ij)[\overline{𝒢_w}/ff]_P on the left-hand side. We now apply
  the induction hypothesis on this pair and obtain that af(φ', w_ij) is
  propositionally entailed by the whole left-hand side. Applying this idea to all
  conjuncts yields the claim.


Let us now consider the general case. Let k be the maximum of thr_w(ψ, 𝒢_w) for
elements of 𝒢_w. Then we have w_k ⊨ Gψ for every Gψ ∈ 𝒢_w. Let φ' = af(φ, w_0k). By
Proposition 2, we have w_k ⊨ φ', and we can apply the reasoning above to obtain:
for almost every i ∈ ℕ: 𝒢_w ∧ ℱ_{w_k}(𝒢_w, w_k(k+i)) ⊨_P af(φ', w_k(k+i)). Since
ℱ(𝒢_w, w_0(k+i)) contains all conjuncts of ℱ_{w_k}(𝒢_w, w_k(k+i)), after unfolding the
definitions we finally obtain 𝒢_w ∧ ℱ(𝒢_w, w_0i) ⊨_P af(φ, w_0i) for almost every i ∈ ℕ.
6.2 From the logical characterization to automata


As in the previous section, we transform the logical characterization into an
automaton. For this, we show that ℱ(𝒢, w_0i) is closely related to the ranks at
which the automata M(ψ, 𝒢) accept the word w. Loosely speaking, the fact that
these automata accept tells us that the formulae of 𝒢 eventually hold, and the
ranks at which they accept allow us to determine the formula ℱ_1(𝒢, w_0i), and
hence also ℱ(𝒢, w_0i), for sufficiently large i. We need a preliminary definition.


Definition 24 Let M be a Mojmir automaton with set of states Q_M, and let
sr: Q_M → ℕ be a state-ranking that assigns to each state q ∈ Q_M a rank sr(q).
For every k ∈ ℕ, we define


S(sr, k) = {q ∈ Q_M | sr(q) ≥ k}


In words: S(sr, k) is the set of states that have rank at least k in the state-ranking
sr.


Example 12 For a state-ranking


  q0  q1  q2  q3  q4  q5  q6
(  2   1   ⊥   4   3   ⊥   ⊥ )


we have for example S(sr, 1) = {q0, q1, q3, q4}, and S(sr, 3) = {q3, q4}. For the
bottom state of the DRA in Figure 4 (which is a state-ranking of the Mojmir
automaton on the left of the figure) we get S(sr, 1) = {a ∨ (bUc), bUc} and
S(sr, 2) = {a ∨ (bUc)}.
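
As a small illustration of Definition 24, the sketch below computes S(sr, k) for a
state-ranking stored as a dictionary. The representation (unranked states mapped to
None, standing for ⊥) and the function name are assumptions of the sketch, and the
ranking used is the one of Example 12 read as (2, 1, ⊥, 4, 3, ⊥, ⊥).

```python
BOTTOM = None  # marks an unranked state, written ⊥ in the text


def states_with_rank_at_least(sr, k):
    """S(sr, k): the states whose rank in the state-ranking sr is at least k."""
    return {q for q, rank in sr.items() if rank is not BOTTOM and rank >= k}


# State-ranking of Example 12 (as read above; an assumption of this sketch).
sr = {"q0": 2, "q1": 1, "q2": BOTTOM, "q3": 4,
      "q4": 3, "q5": BOTTOM, "q6": BOTTOM}
```

Calling states_with_rank_at_least(sr, 1) gives {q0, q1, q3, q4}, and with k = 3 it
gives {q3, q4}, matching the sets stated in the example.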


We can now state the theorem. Recall that the Mojmir automaton M(ψ, 𝒢) was
defined in Definition 20, and that the states of its corresponding Rabin automaton
R(ψ, 𝒢) are state-rankings for the states of M(ψ, 𝒢).


Theorem 11 Let 𝒢 ⊆ 𝔾(φ) be closed for w, and let Gψ ∈ 𝒢. For every i ≥ 0, let
sr(i) be the state of R(ψ, 𝒢) reached after w_0i (in other words, sr(i) = δ_ψ(q_0ψ, w_0i),
where δ_ψ is the transition function of R(ψ, 𝒢)). Finally, let r be the smallest rank
at which R(ψ, 𝒢) accepts w. Then


𝒢 ∧ ℱ_1(ψ, 𝒢, w_0i) ≡_P 𝒢 ∧ S(sr(i), r) for almost every i ∈ ℕ.

Before proving the theorem, let us consider an example.


Example 13 Figure 10 shows the transition system 𝒯(φ), the Mojmir automaton
M(ψ), and the DRA R(ψ) for the formula φ = b ∨ XGψ with ψ = a ∨ X(bUc) (cf.
Figure 9). The state (i, j) of R(ψ) indicates that ψ has rank i and bUc has rank j.
We have


merge(1) = 0) succeed(1) = {t1, ts, t7} 


fan 10s, e) merge(2) = (tc) ^ succeed(2) = (ta, tz, tg} 


We examine again the three words of Example 9. 


Let w = a^ω. The run of R(ψ) on w is tf, and so R(ψ) accepts w at rank 1.
Recall that ℱ_1(ψ, 𝒢, w_0i) = ψ for every i ≥ 0. So we have


𝒢 ∧ ℱ_1(𝒢, w_0i) = Gψ ∧ ψ for almost every i ∈ ℕ

Fig. 10: Transition system 𝒯(φ) and automata M(ψ) and R(ψ) for φ = b ∨ XGψ and
ψ = a ∨ X(bUc).


Further, since S(sr(i), 1) is the conjunction of the states q of M(ψ) such that
sr(w_0i, q) ≥ 1, and the run of R(ψ) on w only visits (1, ⊥), we have sr(i) = (1, ⊥)
for every i ≥ 0, and so S(sr(i), 1) = ψ. We get


𝒢 ∧ S(sr(i), 1) = Gψ ∧ ψ for almost every i ∈ ℕ


which is indeed propositionally equivalent to 𝒢 ∧ ℱ_1(𝒢, w_0i).


Let now w = c^ω. The run of R(ψ) on w is tats, and so R(ψ) accepts w at rank
1. But now we have ℱ_1(ψ, 𝒢, w_0i) ≡_P ψ ∧ (bUc) for every i ≥ 2, and so


𝒢 ∧ ℱ_1(𝒢, w_0i) = Gψ ∧ ψ ∧ (bUc) for almost every i ∈ ℕ


Since the run of R(ψ) on w gets trapped in state (2, 1), we have S(sr(i), 1) = ψ ∧ (bUc)
for almost every i ≥ 2, and so


𝒢 ∧ S(sr(i), 1) = Gψ ∧ ψ ∧ (bUc) for almost every i ∈ ℕ


Finally, let w = abc(abc)^ω. The run of R(ψ) on w is tet7Z, and so R(ψ) accepts
w at rank 2 and not at rank 1. We have ℱ_1(ψ, 𝒢, w_0i) = ψ for every i ≥ 1, and so


𝒢 ∧ ℱ_1(𝒢, w_0i) = Gψ ∧ ψ for almost every i ∈ ℕ


Further, since the run of R(ψ) on w gets trapped in state (2, 1), we have S(sr(i), 2) =
ψ for almost every i ≥ 0, and so


𝒢 ∧ S(sr(i), 2) = Gψ ∧ ψ for almost every i ∈ ℕ


Before proving the theorem we have a closer look at the succeeding tokens of 
a Mojmir automaton. Assume that a Mojmir automaton accepts a word, and we 
are given the rank at which the word is accepted. The following lemma (proved 
in the Appendix) shows that from some moment on whether a token succeeds or 
not depends only on its birthdate, its current rank, and its current state. Most 
importantly, all young enough tokens will succeed.

Lemma 11 Let M(ψ, 𝒢) be the Mojmir automaton for a formula ψ. Assume
M(ψ, 𝒢) accepts a word w at the smallest accepting rank r. For almost every t ∈ ℕ
and for every token τ of the run of M(ψ, 𝒢) on w, the token succeeds iff


1. τ > t, or
2. sr_w(t, run_w(τ, t)) ≥ r, or
3. run_w(τ, t) ∈ F.


The proof of the theorem is based on the crucial insight that each af_G(ψ, w_τt)
precisely corresponds to the state that token τ occupies at time t.


Proof (Of Theorem 11) Consider the run of M(ψ, 𝒢) on the word w. Let t be large
enough so that


- every token τ succeeds iff one of the three conditions of Lemma 11 holds, and
- all tokens τ < thr_w(ψ, 𝒢) that succeed have already reached the set of accepting
  states of M(ψ, 𝒢).


Let m ≥ t. We prove 𝒢 ∧ ℱ_1(ψ, 𝒢, w_0m) ≡_P 𝒢 ∧ S(sr(m), r).


(⇒): 𝒢 ∧ ℱ_1(ψ, 𝒢, w_0m) ⊨_P 𝒢 ∧ S(sr(m), r).

By definition we have S(sr(m), r) = {q ∈ Q_M(ψ,𝒢) | sr_w(m, q) ≥ r}, and so it
suffices to show that 𝒢 ⊨_P q or ℱ_1(ψ, 𝒢, w_0m) ⊨_P q holds for every q ∈ S(sr(m), r).
Assume 𝒢 ⊭_P q. We prove ℱ_1(ψ, 𝒢, w_0m) ⊨_P q.

We position ourselves at time m: when we talk about the rank or the state of a
token we mean its rank or state at time m. Since sr_w(m, q) ≥ r, in particular the
state q is ranked, and so every token on state q has rank sr_w(m, q). Let τ be any of
these tokens. By our choice of t, and since t ≤ m, all tokens with rank greater than
or equal to r succeed. So τ succeeds. Moreover, since 𝒢 ⊭_P q, the state q is not an
accepting state of M(ψ, 𝒢), and so τ has not succeeded yet. So τ will eventually
reach the accepting states of M(ψ, 𝒢) in the future. Moreover, by our choice of
t, all tokens born before thr_w(ψ, 𝒢) that succeed have already reached the
accepting states. So we have τ ≥ thr_w(ψ, 𝒢), and so, by the definition of
ℱ_1(ψ, 𝒢, w_0m), we get ℱ_1(ψ, 𝒢, w_0m) ⊨_P af_G(ψ, w_τm) (notice that τ ≤ m because
token τ occupies state q at time m and hence was already born at time m). By the
definition of the transition system of M(ψ, 𝒢), the equivalence class [af_G(ψ, w_τm)]_P
is precisely the state of M(ψ, 𝒢) reached by token τ at time m, that is,
q = [af_G(ψ, w_τm)]_P. So ℱ_1(ψ, 𝒢, w_0m) ⊨_P q.


(⇐): 𝒢 ∧ S(sr(m), r) ⊨_P 𝒢 ∧ ℱ_1(ψ, 𝒢, w_0m).

By the definition of ℱ_1 it suffices to show that the left-hand side implies
af_G(ψ, w_im) for every thr_w(ψ, 𝒢) ≤ i ≤ m. Without loss of generality we assume
𝒢 ⊭_P af_G(ψ, w_im). Consider the token created at time i. Since it is not created
before time thr_w(ψ, 𝒢), it will eventually reach the accepting states by the
definition of the threshold and succeed. Furthermore, since i ≤ m, one of the three
conditions of Lemma 11 with t = m and τ = i holds. Since τ cannot satisfy conditions
(1) or (3) (𝒢 ⊭_P af_G(ψ, w_im)), it must satisfy condition (2). So the rank of the
state run_w(i, m) at time m is at least r, and so it belongs to S(sr(m), r). But the
state run_w(i, m) is the state reached by token i at time m, and so it is equal to
[af_G(ψ, w_im)]_P. So 𝒢 ∧ S(sr(m), r) ⊨_P af_G(ψ, w_im).
6.3 The automaton A(φ): Informal definition


Let us first recall the structure of the DGRA R(FGψ) for an FG-formula. It
is the union of DGRAs R(𝒢), one for each subset 𝒢 ⊆ 𝔾(φ) containing Gψ.
Given a set 𝒢 = {Gψ_1, ..., Gψ_n} of G-subformulae, R(𝒢) accepts all words w
satisfying φ and FGψ_1, ..., FGψ_n. It is defined as the intersection of the DRAs
R(ψ_1, 𝒢), ..., R(ψ_n, 𝒢), which all have the same transition systems (i.e., the same
states, transitions, and initial state), but differ in their accepting conditions. Recall
that each R(ψ_i, 𝒢) can accept at different ranks (as many as the number of accepting
pairs in R(ψ_i, 𝒢)).

Given an arbitrary formula φ, we also define its DGRA A(φ) as a union
of DGRAs. However, the union now contains an element R(𝒢, r) for every set
𝒢 = {Gψ_1, ..., Gψ_n} ⊆ 𝔾(φ), and for each possible vector r = (r_1, ..., r_n) of
accepting ranks of R(ψ_1, 𝒢), ..., R(ψ_n, 𝒢). For example, if n = 2 and R(ψ_1, 𝒢)
and R(ψ_2, 𝒢) have 3 and 2 accepting pairs, respectively, then instead of one single
DGRA R(𝒢) we have six DGRAs R(𝒢, (1, 1)), ..., R(𝒢, (3, 2)).
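
The count in this example is just the size of a Cartesian product of rank ranges. The
following sketch, in which the function name rank_vectors and the list-of-counts input
are assumptions made for illustration, enumerates the rank vectors for given numbers of
accepting pairs.

```python
from itertools import product


def rank_vectors(pair_counts):
    """All rank vectors (r_1, ..., r_n) with 1 <= r_i <= pair_counts[i]."""
    return list(product(*(range(1, n + 1) for n in pair_counts)))


# Two component automata with 3 and 2 accepting pairs, as in the example above.
vectors = rank_vectors([3, 2])
```

For the counts 3 and 2 this yields six vectors, from (1, 1) to (3, 2), one for each
DGRA R(𝒢, r).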

The transition system of R(𝒢, r) is the product of the transition system 𝒯(φ)
and the transition system of R(𝒢). Recall that 𝒯(φ) has Reach(φ) as set of states,
and af as transition function. Since, in turn, the transition system of R(𝒢) is the
product of the transition systems of R(ψ_1, 𝒢), ..., R(ψ_n, 𝒢), a state of R(𝒢) is a
tuple (sr_1, ..., sr_n), where sr_i is a state-ranking of the formulae of Reach_G(ψ_i),
and a state of R(𝒢, r) is a tuple (x, sr_1, ..., sr_n), where x ∈ Reach(φ).

It remains to describe the accepting condition of R(𝒢, r). We say that R(𝒢)
accepts at rank-vector r = (r_1, ..., r_n) if each R(ψ_i, 𝒢) accepts at rank r_i. Our goal
is to design the accepting condition as a conjunction of two conditions guaranteeing
that:


(i) 𝒢 is closed (which implies that R(𝒢) accepts), and moreover R(𝒢) accepts at
rank-vector r, and
(ii) R(𝒢, r) eventually stays within states (x, sr_1, ..., sr_n) satisfying


𝒢 ∧ S(sr_1, r_1)[\overline{𝒢}/ff]_P ∧ ··· ∧ S(sr_n, r_n)[\overline{𝒢}/ff]_P ⊨_P x


In particular, (i) checks condition (1) of the logical characterization theorem,
Theorem 10. Let us now see that (ii) checks condition (2*). By definition, the
formula x reached after reading a finite prefix w_0i of a word w is the formula
af(φ, w_0i). Therefore, (ii) is equivalent to


𝒢 ∧ (S(sr_1(w_0i), r_1) ∧ ··· ∧ S(sr_n(w_0i), r_n))[\overline{𝒢}/ff]_P ⊨_P af(φ, w_0i)
for almost every i ∈ ℕ


which by Theorem 11 is equivalent, after propositional substitution of \overline{𝒢}
with ff on both sides, to


𝒢 ∧ ℱ(𝒢, w_0i) ⊨_P af(φ, w_0i) for almost every i ∈ ℕ


and so to condition (2*) of the logical characterization theorem.
We still have to express (i) and (ii) as generalized Rabin conditions. Condition
(i) is a conjunction of conditions expressing that R(ψ_i, 𝒢) accepts at rank r_i
for every 1 ≤ i ≤ n. Let P_1 ∨ ··· ∨ P_n be the accepting condition of R(ψ_i, 𝒢).
Recall that R(ψ_i, 𝒢) accepts at rank r_i if it accepts with the Rabin pair P_{r_i}.

P_{r_i} ∨ P_{r_i+1} ∨ ··· ∨ P_n.

Further, condition (ii) is a co-Büchi condition, which is a
special case of a Rabin condition. So the conjunction of (i) and (ii) is a conjunction
of Rabin conditions, and so a generalized Rabin condition.

Observe that condition (i) can be decomposed into a conjunction of conditions, 
each of which concerns only one of the automata in the product. On the contrary, 
condition (ii) involves all components of the product, and cannot be decomposed. 

As in the case of FG-formulae, it remains to deal with the state-explosion
problem. Recall that, when we introduced the automata R(ψ, 𝒢), we observed that
they can all be constructed so that they all have the same transition system, and
therefore the intersection R(𝒢) has the same transition system as well. Since R(𝒢)
and R(𝒢, r) have the same transition system, the same happens now.


6.4 The automaton A(φ): Formal definition

We conclude the section by giving a precise definition of the automaton A(φ).


Definition 25 Let φ be an arbitrary formula, and let 𝔾(φ) = {Gψ_1, ..., Gψ_n}
be the set of G-subformulae of φ. For every formula Gψ_i ∈ 𝔾(φ), let R(ψ_i, 𝒢) =
(Q_i, 2^Ap, q_0i, δ_i, Acc_i^𝒢) be the DRA obtained by applying Definition 16 to the
Mojmir automaton M(ψ_i, 𝒢). Recall that a state of Q_i is a state-ranking of the
states of M(ψ_i, 𝒢). We use sr_i to denote a state-ranking of Q_i.

The DGRA A(φ) = (Q_φ, 2^Ap, q_0φ, δ_φ, Acc_φ) is defined as follows:


- Q_φ = Reach(φ) × Q_1 × ··· × Q_n.

- q_0φ = (φ, q_01, ..., q_0n).

- δ_φ((x, sr_1, ..., sr_n), a) = (af(x, a), δ_1(sr_1, a), ..., δ_n(sr_n, a)).

- Acc_φ is a disjunction containing a disjunct Acc_r^𝒢 for each pair (𝒢, r), where
  𝒢 ⊆ 𝔾(φ) and r is a mapping assigning to each Gψ ∈ 𝒢 a rank, i.e., a number
  between 1 and the number of Rabin pairs of R(ψ, 𝒢); each Acc_r^𝒢 is then of the
  form

  M_r^𝒢 ∧ ⋀_{Gψ ∈ 𝒢} Acc_r^𝒢(ψ)

  where Acc_r^𝒢(ψ) denotes the Rabin pair of R(ψ, 𝒢) with number r(ψ), and M_r^𝒢
  says that transitions taken infinitely often by A(φ) must lead into the following
  set:

  { (x, sr_1, ..., sr_n) ∈ Q_φ | 𝒢 ∧ ⋀_{Gψ_i ∈ 𝒢} S(sr_i, r(ψ_i))[\overline{𝒢}/ff]_P ⊨_P x }.


Observe that M_r^𝒢 can be phrased as a co-Büchi condition on transitions.
Therefore, the whole condition Acc_φ is a generalized Rabin condition.
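
The shape of Definition 25 can be sketched in a few lines of Python. This is an
illustration under assumptions, not the implementation used for the experiments: the
component transition functions are passed in as plain callables, G-subformulae are
identified by hypothetical string names, and the number of Rabin pairs of each
component is taken to be known in advance. The first function is the product
transition function δ_φ; the second enumerates the pairs (𝒢, r) that index the
disjuncts of Acc_φ.

```python
from itertools import combinations, product


def product_step(state, letter, af, deltas):
    """delta_phi: apply af to the first component and delta_i to each ranking."""
    x, *rankings = state
    return (af(x, letter),
            *(delta(sr, letter) for delta, sr in zip(deltas, rankings)))


def run_prefix(initial, word, af, deltas):
    """State of the product reached from `initial` after reading a finite word."""
    state = initial
    for letter in word:
        state = product_step(state, letter, af, deltas)
    return state


def disjunct_indices(pairs):
    """Yield every (G, r): G a subset of the G-subformulae, r a rank for each.

    `pairs` maps the (hypothetical) name of each G-subformula to the number
    of Rabin pairs of its component automaton.
    """
    names = sorted(pairs)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            ranges = (range(1, pairs[g] + 1) for g in subset)
            for ranks in product(*ranges):
                yield frozenset(subset), dict(zip(subset, ranks))
```

For a formula with a single G-subformula whose automaton has two Rabin pairs,
disjunct_indices yields three index pairs: the empty set with the empty rank mapping,
and the singleton set with rank 1 and with rank 2.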


Example 14 Recall Example 13 illustrated in Figure 10. The states of A(φ) are
pairs (x, sr), where x is a state of 𝒯(φ) (on the left of the figure) and sr is a state
of R(ψ) (on the right). Rank vectors have only one component, and so we write the
rank r itself instead of the vector (r). Since R(ψ) has two Rabin pairs, we have
r = 1 or r = 2.

For 𝒢 = ∅ we have Acc_r^∅ = M_r^∅, and, independently of r, condition M_r^∅ requests
that A(φ) eventually stays in states (x, sr) satisfying tt ⊨_P x, and so in the set


{ (q_1, sr_0), (q_1, sr_1) }.

For 𝒢 = {Gψ} we have Acc_r^𝒢 = M_r^𝒢 ∧ Acc_r^𝒢(ψ). Condition Acc_r^𝒢(ψ) states that
R(ψ) must accept using the pair P(r). Let us now examine M_1^𝒢 and M_2^𝒢, starting
with the latter.

M_2^𝒢 requests that A(φ) eventually stays in states (x, sr) satisfying Gψ ∧
S(sr, 2)[\overline{𝒢}/ff]_P ⊨_P x. Since S(sr_0, 2) = tt and S(sr_1, 2) = ψ (see the Mojmir
automaton in the middle of the figure), A(φ) must eventually stay in states (x, sr_0)
satisfying Gψ ⊨_P x or states (x, sr_1) satisfying Gψ ∧ ψ ⊨_P x, and so in the states
{q_1, q_2} × {sr_0, sr_1}.

M_1^𝒢 requests that A(φ) eventually stays in states (x, sr) satisfying Gψ ∧
S(sr, 1)[\overline{𝒢}/ff]_P ⊨_P x. Since S(sr_1, 1) = {p_0, p_1} = ψ ∧ (bUc), we have
Gψ ∧ S(sr_1, 1)[\overline{𝒢}/ff]_P ⊨_P x for x = Gψ ∧ (bUc), the formula of state q_3.
So A(φ) must eventually stay in the set ({q_1, q_2} × {sr_0, sr_1}) ∪ {(q_3, sr_1)}.


We now proceed to our final result. 
Theorem 12 For any LTL formula φ, L(A(φ)) = L(φ).


Proof (⇒) By Theorem 10 we only need to prove that if A(φ) accepts w with
𝒢 ⊆ 𝔾(φ) and rank vector r, then (1) 𝒢 is closed for w and (2*) 𝒢 ∧ ℱ(𝒢, w_0i) ⊨_P
af(φ, w_0i) holds for almost every i ∈ ℕ. By construction A(φ) only accepts with
closed 𝒢's and thus (1) holds. For (2*) we observe that A(φ) also accepts w with
the rank vector r* that maps every element of 𝒢 to the smallest accepting rank for
w. So we obtain from M_{r*}^𝒢:


𝒢 ∧ ⋀_{Gψ_i ∈ 𝒢} S(sr_i, r*(ψ_i))[\overline{𝒢}/ff]_P ⊨_P af(φ, w_0i)

By Theorem 11 we have 𝒢 ∧ S(sr_i, r) ⊨_P 𝒢 ∧ ℱ_1(𝒢, w_0i) for almost every i ∈ ℕ,
and by propositional substitution of \overline{𝒢} with ff on both sides we conclude that
property (2*) holds.

(⇐): Let 𝒢 ⊆ 𝔾(φ) be a set satisfying the conditions of Theorem 10, and let r
be the rank vector that maps every element of 𝒢 to the corresponding smallest
accepting rank. We now prove that A(φ) accepts w with Acc_r^𝒢. Since 𝒢 is closed
for w, the Rabin pairs Acc_r^𝒢(ψ) are accepting for all Gψ ∈ 𝒢. Hence it remains to
show that also M_r^𝒢 is accepting. For this we use the other direction of Theorem 11,
i.e., that 𝒢 ∧ ℱ_1(𝒢, w_0i) ⊨_P 𝒢 ∧ S(sr_i, r) for almost every i ∈ ℕ, and
propositionally substitute \overline{𝒢} with ff on both sides.


7 Optimizations 


The construction described in the previous sections can be optimized in a number 
of ways. In fact, we have already presented an important optimization: the fact 
that sink states are not ranked. It is possible to handle sinks just as any other 
state, but this leads to much larger Rabin automata. Even the toy examples of the 
paper would then be too large to be drawn. 

We implemented further optimizations reducing the number of states or the 
size of the accepting condition of the automata. Some, but not all, have been 
mechanically proven. The effect of the optimizations can be seen on examples in 
Tables 3 and 5. 
7.1 Reducing the state space


The first obvious reduction is to construct only the states reachable from the 
initial states. Further, we merge equivalent states in several ways. Interestingly, 
this happens based on the formulae that label the states, and not on the graph 
structure of the automaton, as is the case for, e.g., simulation-based reductions. 


1. Unfolding formulae. 
Let the one-step unfolding Unf of a formula be inductively defined by the 
following rules: 


    Unf(a)     = a                     Unf(Xφ)  = Xφ
    Unf(¬a)    = ¬a                    Unf(Fφ)  = Unf(φ) ∨ Fφ
    Unf(φ ∧ ψ) = Unf(φ) ∧ Unf(ψ)       Unf(Gφ)  = Unf(φ) ∧ Gφ
    Unf(φ ∨ ψ) = Unf(φ) ∨ Unf(ψ)       Unf(φUψ) = Unf(ψ) ∨ (Unf(φ) ∧ (φUψ))


The optimization consists of always using unfolded formulae as states. Note that
af(Unf(φ), ·) = af(φ, ·), since af is Unf followed by plugging in the valuation
read. Therefore, the only change in the transition system of the automaton
is to merge states labelled by φ1 ≠ φ2 such that Unf(φ1) = Unf(φ2). This
is an efficient way to under-approximate LTL equivalence by propositional
equivalence, which is also easier to check (PSPACE vs. NP), e.g. using BDDs.
As a simple example, the optimized automaton for FGa has one state, instead
of two states, as illustrated in Fig. 11.


Fig. 11: Original and optimized co-Büchi automata for FGa


2. Different initial states for DRAs. 

Since no finite prefix influences acceptance of Rabin automata for FG-formulae,
introducing arbitrary initial states for them does not change the accepted
language. Therefore, instead of using “transient” states, which cannot be visited
once left, we try to use states that are reachable even after reading some
prefixes. For instance, consider the formula GF((a ∧ XXa) ∨ (¬a ∧ XX¬a)).
The automaton corresponds to a buffer keeping track of the last few letters read.
Without the optimization, we start with an empty buffer; such an initial state of
the Rabin automaton has only a single token in the initial state of the Mojmir
automaton. Then we read a letter and move to a buffer filled with either a or
¬a. In the next step, we move to a buffer with two letters and from that point
switch only among the two-letter buffers. The total size is thus 2^0 + 2^1 + 2^2 = 7.
However, if we start with an already full buffer (filled with arbitrary letters),
the acceptance is not affected, but the reachable state space is only of size
2^2 = 4.

Fig. 12: A co-Büchi automaton for GF((a ∧ XXa) ∨ (¬a ∧ XX¬a)) and the optimized automaton
inside the grey area (with an arbitrary initial state)


3. Irrelevant DRAs. 


Recall that a state of our parallel composition is an array of formulae, one 
corresponding to the current state of the co-Büchi automaton, and the others 
to the states of the DRAs. We say that a DRA is irrelevant at a state if its 
corresponding G-formula either does not appear inside the current formula of 
the co-Büchi automaton, or it only appears in conjunction with another formula 
without any occurrence of G. For instance, after reading a in a ∧ Fb ∧ FGc ∨
¬a ∧ FGd, the co-Büchi automaton reaches the state Fb ∧ FGc, where the DRA
for the formula d is irrelevant. Consider now Fb ∧ FGc. At this state the DRA
for c is irrelevant, due to the conjunction with Fb. Intuitively, the co-Büchi 
automaton waits for a b, and only after that it is important to monitor the 
satisfaction of FGc. Indeed, postponing the monitoring by finite time does not 
affect acceptance, similarly to the previous optimization. Moreover, if b never 
holds, then it is unnecessary to check satisfaction of FGc. 
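
The state counts in the buffer example of item 2 can be retraced with a small exploration. The following is only a sketch under simplifying assumptions: a state is modelled as the tuple of the last two letters read, and the function name `reachable_buffers` and the letter encoding `"a"` / `"!a"` are hypothetical choices for illustration, not part of Rabinizer.

```python
from collections import deque


def reachable_buffers(initial, letters=("a", "!a"), capacity=2):
    """Return all buffer contents reachable from `initial`.

    A state is a tuple holding the last `capacity` letters read (fewer while
    the buffer is still filling up). This is a simplified stand-in for the
    automaton of the example, not the actual Rabinizer state space.
    """
    seen = {initial}
    queue = deque([initial])
    while queue:
        buffer = queue.popleft()
        for letter in letters:
            # Append the new letter and keep only the most recent ones.
            successor = (buffer + (letter,))[-capacity:]
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


from_empty = reachable_buffers(())          # transient states are included
from_full = reachable_buffers(("a", "a"))   # arbitrary full buffer as start
print(len(from_empty), len(from_full))      # prints: 7 4
```

Starting from the empty buffer visits the transient states of length 0 and 1 once, whereas starting from any full buffer only ever visits the four two-letter buffers.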


7.2 Reducing the acceptance condition 


All disjuncts of a generalized Rabin condition are of the form (F, ⋀_{k∈K} I_k), which
we call a generalized pair. We consider a transition-based condition and denote the 
set of all transitions by T. We remove generalized pairs that cannot be satisfied, 
as well as those whose satisfaction implies satisfaction of another pair. In order 
to detect such pairs, we first simplify them. The optimizations are performed to 
exhaustion in the following order. 


1. Remove every generalized pair (F, 𝓘) such that F = T.
   Such pairs never accept, since the whole T cannot be avoided.

2. Replace every generalized pair (F, 𝓘 ∧ I) such that I ∪ F = T by (F, 𝓘).
   If F is visited only finitely often then T \ F ⊆ I is visited infinitely often.

3. Replace every generalized pair (F, ⋀_{k∈K} I_k) by (F, ⋀_{k∈K} I_k \ F).
   Visiting F infinitely often excludes acceptance.

4. Remove every generalized pair (F, 𝓘 ∧ ∅).
   The empty set cannot be visited (infinitely often).

5. Replace every generalized pair (F, 𝓘 ∧ I ∧ J) such that I ⊆ J by (F, 𝓘 ∧ I).
   If I is visited infinitely often then so is J.

6. Remove every generalized pair (F, ⋀_{k∈K} I_k) for which there exists (F', ⋀_{k'∈K'} I_{k'})
   such that F' ⊆ F, and for each k' ∈ K' there is k ∈ K such that I_k ⊆ I_{k'}.
   A run accepted by the unprimed pair is also accepted by the primed pair.


For example, consider the formula (GF(a ∧ Xb) ∨ FG(b ∨ X¬a)) ∧ (GF(b ∧
Xc) ∨ FG(¬c ∨ Xa)) ∧ (GF(b ∧ XXa) ∨ FG(¬c ∨ X¬b)). We start with 4568 pairs
and after each phase we are left with 4052, 3715, 1997, 131, 122, and finally 12
pairs, respectively.
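
The six phases can be sketched in a few lines. This is an illustrative sketch only, not the Rabinizer code: the function name `simplify_pairs` and the representation of a generalized pair as `(F, [I_1, ..., I_k])` over sets of hashable transitions are assumptions made here. Rules 1 to 5 are local to a pair, so the sketch applies them pair by pair, which has the same effect as running each phase to exhaustion over all pairs; rule 6 is applied afterwards, once exact duplicates have been dropped.

```python
def simplify_pairs(pairs, all_transitions):
    """Simplify generalized pairs (F, [I_1, ..., I_k]) over transition set T.

    F is the set to be visited finitely often; every I_j must be visited
    infinitely often. Transitions can be any hashable objects.
    """
    T = frozenset(all_transitions)
    simplified = []
    for fin, infs in pairs:
        fin = frozenset(fin)
        if fin == T:                                   # rule 1
            continue
        infs = {frozenset(i) for i in infs}
        infs = {i for i in infs if i | fin != T}       # rule 2
        infs = {i - fin for i in infs}                 # rule 3
        if frozenset() in infs:                        # rule 4
            continue
        infs = {j for j in infs
                if not any(i < j for i in infs)}       # rule 5
        simplified.append((fin, frozenset(infs)))

    unique = list(dict.fromkeys(simplified))           # drop exact duplicates

    def subsumed_by(pair, other):                      # rule 6
        fin, infs = pair
        fin2, infs2 = other
        return fin2 <= fin and all(any(i <= i2 for i in infs) for i2 in infs2)

    return [p for p in unique
            if not any(q != p and subsumed_by(p, q) for q in unique)]


T = {1, 2, 3, 4}
pairs = [
    ({1, 2, 3, 4}, [{1}]),      # removed by rule 1
    ({1}, [{2, 3, 4}, {2}]),    # rule 2 drops {2, 3, 4}
    ({1}, [{1}]),               # rules 3 and 4 remove the pair
    ({1}, [{2}, {2, 3}]),       # rule 5 drops {2, 3}
    ({1, 3}, [{2}, {2, 4}]),    # rule 2 drops {2, 4}; rule 6 removes the rest
]
result = simplify_pairs(pairs, T)
```

On this toy input a single pair survives, with F = {1} and one set {2} to be visited infinitely often.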


8 Complexity Bounds 


Before discussing the implementation of our construction and experimental results 
we briefly discuss the worst-case complexity and compare it with that of Safra-based 
constructions. 

Recall that the smallest DRA for an LTL formula of length n may have O(2^{2^n})
states. This is the case even for the fragment of LTL containing only conjunction,
disjunction and the F-operator [AT04, Theorem 3.8]. Indeed, the paper shows that
all DRAs for the formula

\[ \mathbf{F} \bigwedge_{i=1}^{n} (a_i \vee \mathbf{F} b_i) \]

have a double exponential number of states (in n). This lower bound is essentially
matched by LTL-to-DRA translations based on Safra’s construction. These
translations first transform φ into an NBA of size O(2^n), and then apply Safra’s
construction, which runs in m^{O(m)} time and space for an automaton of size m [Saf88].
The overall complexity is thus

\[ 2^{n \cdot O(2^n)} = 2^{O(2^{n + \log n})} \]

Besides, the number of Rabin pairs of Safra-based translations is at most O(m) =
O(2^n).

In our translation of a formula φ, the set of states of our co-Büchi automaton is
Reach(φ), and the sets of states of our DRAs are state-rankings over Reach_G(ψ) for
subformulae ψ of φ. By Lemma 1, if φ has n proper subformulae then both Reach(φ)
and Reach_G(ψ) have size at most 2^{2^n}. Since a state-ranking is a permutation of
Mojmir states, the resulting DRA contains in the worst case all permutations.
Hence the number of states in the product (co-Büchi automaton and at most n
DRAs) is at most

\[ 2^{2^n} \cdot \bigl((2^{2^n})!\bigr)^{n} = 2^{2^{O(2^n)}} \]

Further, each pair corresponds to DRAs accepting at one of fewer than 2^{2^n} ranks, or
not accepting at all. Altogether, there are at most (2^{2^n})^n = 2^{2^{O(n)}} pairs.
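
The triple exponential bound on the number of states can be retraced step by step. The following is only a sketch: it uses the crude estimate k! ≤ k^k and takes the counts from the paragraph above, namely at most 2^{2^n} states for the co-Büchi automaton and at most (2^{2^n})! state-rankings for each of the at most n DRAs.

```latex
\begin{align*}
(2^{2^n})! &\le (2^{2^n})^{2^{2^n}} = 2^{2^n \cdot 2^{2^n}} = 2^{2^{2^n + n}}, \\
2^{2^n} \cdot \bigl((2^{2^n})!\bigr)^{n}
  &\le 2^{2^n} \cdot 2^{n \cdot 2^{2^n + n}}
   \le 2^{2^{2^n + n + \log n + 1}} = 2^{2^{O(2^n)}}.
\end{align*}
```

The second inequality in the last line uses that 2^n ≤ n · 2^{2^n + n} for n ≥ 1, so the sum of the two exponents is at most twice the larger one.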

We conjecture that there is a family of formulae for which our construction
indeed produces automata of triple exponential size, although we have not yet been
able to find one.


Consider now the LTL fragment with syntax

    A ::= A ∧ A | A ∨ A | GFα | FGα
    α ::= a | ¬a | α ∧ α | α ∨ α

where a ∈ Ap. This fragment contains many interesting fairness formulae, like those
of the family ⋀_{i=1}^{n} (GFa_i → GFb_i). Our construction yields DGRAs with only
one single state, provided we use the unfolding optimization presented in Section 7.
Indeed, a simple induction shows that for every formula φ in the fragment and for
every ν ∈ 2^{Ap}, we have Unf(af(φ, ν)) ≡_P Unf(φ). Therefore, if we take Unf(φ) as
the initial state, the co-Büchi automaton only has one reachable state. By a similar
argument, replacing af by af_G, the Mojmir automaton M(ψ) for a G-subformula
Gψ also has one single state, and the same holds for its corresponding Rabin
automaton. Since every component of the parallel composition only has one state,
the same holds for the parallel composition itself. Note that without the unfolding
optimization the co-Büchi automaton for ⋀_{i=1}^{n} FGa_i would have 2^n states.
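
The single-state claim can be checked mechanically on a small instance. The sketch below is not the Rabinizer implementation; the tuple encoding of formulae and all function names are assumptions made for illustration. It implements af and Unf for the operators of the fragment (plus X) and decides propositional equivalence by brute force, treating every formula that is not a conjunction, disjunction or constant as a propositional atom.

```python
from itertools import product

TT, FF = ("tt",), ("ff",)


def af(f, nu):
    """One-letter 'after' function; nu is the set of propositions that hold."""
    op = f[0]
    if op in ("tt", "ff"):
        return f
    if op == "ap":
        return TT if f[1] in nu else FF
    if op == "nap":                      # negated atomic proposition
        return FF if f[1] in nu else TT
    if op in ("and", "or"):
        return (op, af(f[1], nu), af(f[2], nu))
    if op == "X":
        return f[1]
    if op == "F":
        return ("or", af(f[1], nu), f)
    if op == "G":
        return ("and", af(f[1], nu), f)
    raise ValueError(f"unsupported operator: {op}")


def unf(f):
    """One-step unfolding, following the rules for Unf in Section 7.1."""
    op = f[0]
    if op in ("tt", "ff", "ap", "nap", "X"):
        return f
    if op in ("and", "or"):
        return (op, unf(f[1]), unf(f[2]))
    if op == "F":
        return ("or", unf(f[1]), f)
    if op == "G":
        return ("and", unf(f[1]), f)
    raise ValueError(f"unsupported operator: {op}")


def atoms(f):
    """Propositional atoms: everything below the Boolean structure."""
    if f[0] in ("tt", "ff"):
        return set()
    if f[0] in ("and", "or"):
        return atoms(f[1]) | atoms(f[2])
    return {f}


def holds(f, true_atoms):
    if f[0] in ("tt", "ff"):
        return f[0] == "tt"
    if f[0] == "and":
        return holds(f[1], true_atoms) and holds(f[2], true_atoms)
    if f[0] == "or":
        return holds(f[1], true_atoms) or holds(f[2], true_atoms)
    return f in true_atoms


def prop_equiv(f, g):
    """Brute-force propositional equivalence over the atoms of f and g."""
    props = list(atoms(f) | atoms(g))
    for bits in product((False, True), repeat=len(props)):
        chosen = {p for p, bit in zip(props, bits) if bit}
        if holds(f, chosen) != holds(g, chosen):
            return False
    return True


def GF(x):
    return ("G", ("F", x))


def FG(x):
    return ("F", ("G", x))


a, b = ("ap", "a"), ("ap", "b")
phi = ("or", FG(("nap", "a")), GF(b))     # GFa -> GFb in negation normal form
letters = [set(), {"a"}, {"b"}, {"a", "b"}]
one_state = all(prop_equiv(unf(af(phi, nu)), unf(phi)) for nu in letters)
```

For this instance `one_state` evaluates to `True`, whereas `prop_equiv(af(FG(a), {"a"}), FG(a))` is `False`, which matches the remark that the unfolding is needed to collapse the states.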


9 Implementation and Experimental Results 
9.1 Implementation 


The construction is implemented in the tool Rabinizer 3, which was reported on
in [KK14]. It is written in Java and uses JavaBDD to work with formulae as
Boolean functions. Furthermore, in order to optimize the construction time, we
have implemented a new version 3.1 of the tool.⁷ It also uses BDDs for labelling
edges in automata and explores the state space in this more symbolic way rather
than examining successors for each valuation separately.

The implementation allows the user to choose between the mechanically proved
construction and switching on any subset of the described optimizations. Furthermore,
apart from producing the resulting transition-based generalized Rabin automata,
it can also convert the result to state-based automata as well as degeneralize them
into Rabin automata.

Finally, there is a choice of output formats: dot format, useful for graphical
representation, e.g. by dotty or Graphviz; and the HOA (Hanoi omega-automata)
format, the new standard [BBDL 15], nowadays implemented by other translators
as well as PRISM. This allows for linking Rabinizer to PRISM, resulting in a
significantly faster probabilistic LTL model checker, see [CGK13, KK14].


9.2 Experimental results 


We compare the performance of the following tools and methods in terms of the 
number of states of the resulting automata. 


(L*) ltl2dstar [Kle05] implements and optimizes [KB07] Safra's construction [Saf88].
It uses LTL2BA [GO01] to obtain the non-deterministic Büchi automata (NBA)
first. Other translators to NBA may also be used, such as Spot [DL13] or
LTL3BA [BKRS12] and in some cases may yield better results (see [BKS13] for
comparison thereof), but LTL2BA is recommended by ltl2dstar and is used
this way in PRISM [KNP11].


7 http://www7.in.tum.de/~kretinsk/rabinizer3.html

(R1/2) Rabinizer [GKE12] and Rabinizer 2 [KLG13] implement a direct construction
based on [KE12] for fragments LTL(F, G) and LTL\GU⁸, respectively. The
latter tool is applied here only on formulae not in LTL(F, G).

(L3) LTL3DRA [BBKS13] implements a construction via alternating automata, which
is “inspired by [KE12]” (quoted from [BBKS13]) and performs several optimizations.

(R3) Rabinizer 3.1 performs our new construction. Unless specified otherwise we
employ the previously described optimizations. Notice that we produce a state
space with a logical structure, which permits many further optimizations;
for instance, one could incorporate the suspension optimization of LTL3BA
[BBDL+13].


For L* and R1/2 we produce DRAs (although Rabinizer 2 can also produce 
DGRAs) with state-based acceptance conditions. For L3 and R3 we produce 
DGRAs with transition-based acceptance conditions (tDGRAs), which can be 
directly used for probabilistic model checking without any blow-up [CGK13]. 
Inapplicability of a tool to a formula is denoted in tables by —. All automata in
this section were constructed within a few seconds, with the exception of the larger
automata generated by ltl2dstar: it took several minutes for automata over ten
thousand states and hours for hundreds of thousands of states. The automaton for
⋀_{i=1}^{3} (GFa_i → GFb_i) took even more than a day, and “?” denotes a time-out after
one day.

Table 1 shows formulae of the LTL(F, G) fragment. The upper part comes from 
BEEM (BEnchmarks for Explicit Model checkers) [Pel07], the lower one from [SB00] 
on which ltl2dstar was originally tested [KB06]. There are overlaps between the
two sets. All the formulae were used already in [KE12, BBKS13]. Although more 
general, our method usually achieves the same results as the optimized LTL3DRA, 
outperforming the first two approaches. 

Table 2 shows formulae of LTL\GU used in [KLG13]. The first part comes
mostly from the same sources and [EH00]. The second part is considered in [KLG13]
in order to demonstrate the difficulties of the standard approach to handle


1. many X-operators inside the scope of other temporal operators, especially U, 
where the DRAs are already quite complex, and 

2. conjunctions of liveness properties where the efficiency of generalized Rabin 
acceptance condition may be fully exploited. 


Table 3 contains formulae of general LTL. The first part contains two
randomly picked formulae illustrating the same two phenomena as in the previous
table, now on general LTL formulae. The second part contains two examples of
formulae from the network monitoring project Liberouter⁹. The third part contains
five more complex formulae from Spec Patterns [DAC99]¹⁰, which express the
following “after Q until R” properties:


φ35: G(!q ∨ (Gp ∨ (!pU(r ∨ (s ∧ !p ∧ X(!pUt))))))
φ40: G(!q ∨ (((!s ∨ r) ∨ X(G(!t ∨ r) ∨ !rU(r ∧ (!t ∨ r))))U(r ∨ p) ∨ G((!s ∨ XG!t))))


8 LTL\GU was introduced in [KLG13] and disallows occurrences of U in the scope of G.
9 https://www.liberouter.org/


10 Spec Patterns: Property Pattern Mappings for LTL.
http://patterns.projects.cis.ksu.edu/documentation/patterns/ltl.shtml

Table 1: Experimental results on the LTL(F,G) fragment


Formula L* R1 L3 R3 
G(a V Fb) 442 23 
FGa v FGb v GFc e. B bod 
F(a v b) 2 2 2 2 
GF(a V b) 2 21 1 
G(av bVc) 3.222 
G(a v F(b V c)) 4 42 2 
Fa V Gb 4 3 3 3 
G(aV F(bA c)) 4 42 2 
FGa v GFb) 441 1 
GF(a v b) ^ GF(b v c) Tout qr 
FFa ^ Gua) V (GG-^a ^ Fa) 101 2 
GFa) ^ FGb 3 3 1 1 
GFa ^ FGb) v (FG-a ^ GF-b) ae ae 
FGa ^ GFa 2 2 1 1 
G(Fa ^ Fb) 5 313 
Fa ^A F-a 4 444 
G(b v GFa) ^ G(c V GF-2a)) V Gb V Gc|13 18 4 4 
G(bv FGa) ^A G(c V FG~a)) V Gb V Gc|14 6 4 4 
F(bAFGa)VF(cAFG-a)^Fb^AFc |7 5 4 4 
F(b A GFa) V F(cA GFra)) \FbA Fe | 7 5 4 4 


Table 2: Experimental results on the LTL\GU fragment


Formula L* R2 L3 R3 
Fp)U(Gq) 4 3 2 2 
Gp)Uq 5 5 5 5 
pV q)Up V Gq 4 3 3 3 
G(!p V Fq) ^ ((Xp)Uq V X((Ipv!q)U!p V G(lpv!q))) 19 8- 5 
G(q V XGp) ^ G(r V XG!p) 5 14 4 4 
X(Gr V rU(r ^ sUp)))U(Gr V rU(r ^ s)) 18 9 8 8 

pU(q ^ X(r ^ (F(s ^ X(F(t ^ X(F(u^ XFv))))))) 9 13 13 13 
GF(a A XXb) V FGb) A FG(cV (Xa ^ XXb)) 353 73 — 12 
GF(XXXa ^ XXXXb) ^ GF(b V Xc) A GF(c^ XXa)| 2127169 — 16 
GFa V FGb) ^ (GFc V FG(d V Xe)) 18176 80 — 2 
GF(a ^ XXc) V FGb) ^ (GFc V FG(d V Xa ^ XXb)) ?142 — 12 
aUb ^ (GFa V FGb) ^ (GFc v FGd)v 640771210 8 7 

VaUc ^ (GFa V FGd) ^ (GFc V FGb) 


φ45: G(!q ∨ (!s ∨ X(G!t ∨ !rU(r ∧ !t)) ∨ X(!rU(t ∧ Fp)))U(r ∨ G(!s ∨ X(G!t ∨ !rU(r ∧ !t)) ∨
X(!rU(t ∧ Fp)))))

φ50: G(!q ∨ (!p ∨ (!rU(s ∧ !r ∧ X(!rUt))))U(r ∨ G(!p ∨ (s ∧ XFt))))

φ55: G(!q ∨ (!p ∨ (!rU(s ∧ !r ∧ !z ∧ X((!r ∧ !z)Ut))))U(r ∨ G(!p ∨ (s ∧ !z ∧ X(!zUt)))))


Here we also compare unoptimized and optimized versions of our construction.
Table 4 contains formulae of general LTL generated randomly by randltl
[DL13]. The first part contains formulae over at most 4 atomic propositions and
of length 10 to 20 (before simplifications enforced by randltl). The second part
contains formulae over at most 8 atomic propositions and of length 15 to 50 (before
the simplifications). Each part contains 1000 formulae and their negations; the time-out
per formula was set to 1 minute, using ltlcross [DL13]. Since the formulae are
of general LTL we compare only ltl2dstar and (optimized) Rabinizer 3.1. In
Table 3: Experimental results on general LTL


Formula L* R1/2 L3 R3-unopt. R3-opt. 
FG((a ^ XXb ^ GFb)U(G(XX!e v XX(a ^ b)))) 2053 — — 9 3 
G(F!a ^ F(b ^ X!c) ^ GF(aUd)) ^ GF((Xd)U(b v Go)) 283 — — 25 7 
GUDJA (p2U (92) U (153) V p4))) (Mmm 6 — 4 
G(((p1) ^ X!p1) V X(p1U(((1p2) ^ p1)^ 8 — — 12 9 
X(p2 ^ pl ^ (p1U(((1p2) ^ pl) ^ X(p2 ^ p1)))))) 
935 : 2 cause-1 effect precedence chain 6 — — 9 6 
940 : 1 cause-2 effect precedence chain 314 — — 16 16 
p45 : 2 stimulus-1 response chain 1450 — — 81 68 
950 : 1 stimulus-2 response chain 28 — — 36 21 
(955 : 1-2 response chain constrained by a single proposition| 28 — — 36 21 


addition, we also provide a comparison to Spot [DL13], one of the most efficient tools
producing non-deterministic transition-based generalized Büchi automata (the
table displays the percentage of actually non-deterministic automata produced).
The table provides the average and maximal number of states per automaton on
each set; for the more complex set we also provide the percentage of automata
greater than 10, 100, and 1000 states (among the results that did not time out).
Finally, we also display the percentage of results smaller than, equal to, and greater
than those of Rabinizer 3.1.


Table 4: Experimental results on general LTL 


                                 | L*                    | R3                  | Spot (non-det. TGBA)
avg size                         | 22.4 (0.1% time-outs) | 4.3                 | 3.1 (39% non-det.)
max size                         | 5815                  | 65                  | 19
smaller / equal / greater vs. R3 | 11% / 34% / 55%       |                     | 59% / 28% / 13%

avg size                         | 839.5 (13% time-outs) | 12.2 (3% time-outs) | 6.0 (65% non-det.)
max size                         | 86 896                | 387                 | 78
smaller / equal / greater vs. R3 | 13% / 17% / 70%       |                     | 63% / 18% / 19%
>10 / >100 / >1000               | 51% / 20% / 4%        | 27% / 4% / 0%       | 12% / 0% / 0%


9.3 Advantages and limits of the approach 


In this section, we focus on formulae with extremely complex acceptance conditions. 
This is caused by combinations of “infinitary” behaviour, whose satisfaction does not 
depend on any finite prefix of the word. A typical example is the “fairness”-fragment 
given by A of Section 8. In this case, our DGRA have only one state. While DRA 
need to remember the last letter read, the transition-based acceptance together 
with the generalized acceptance condition allow transition-based DGRA not to 
remember anything. Such formulae are the most difficult for our approach as well as
for the traditional determinization approach.
The formulae from Table 5 were used before in [KE12, BBKS13] and include
fairness-like constraints.


Table 5: Experimental results on the “fairness” fragment given by A of Section 8


Formula L* R1 L3 R3-unopt. R3-opt. 
(FGa v GFb) 4 4 4 1 
(FGa v GFb) ^ (FGc v GFd) 11324 18 16 1 

3 (FGa; V GFb;) 1304706 462 64 1 
GF(Fa V GFb v FG(a Vb) 14 4 3 1 
FG(Fa v GFb v FG(a v b)) 145 41 4 1 
FG(Fa v GFb v FG(a v b) v FGb) 181 4 1 4 1 
(GFa V FGb) 4 4 1 4 1 
(GFa v FGb) ^ (GFb v FGc) 572 11 9 1 
(GFa v FGb) ^ (GFb v FGc) ^ (GFc v FGd) 290046 52 17 1 
(GFa v FGb) ^ (GFb v FGc) ^ (GFc v FGd) ^ (GFd v FGh) ? 1288 1 33 ï 


Table 6 shows that it is very beneficial to use the generalized Rabin acceptance. 


Furthermore, using transition-based acceptance even more states are saved. 


Table 6: Experimental comparisons of acceptance conditions. We display number of states and
acceptance pairs for ltl2dstar and Rabinizer 3 producing different types of automata, all
with the same number of pairs. Here ψ1 = FG(((a ∧ XXb) ∧ GFb)UG(XX!c ∨ XX(a ∧ b)))
and ψ2 = G(!q ∨ (((!s ∨ r) ∨ X(G(!t ∨ r) ∨ !rU(r ∧ (!t ∨ r))))U(r ∨ p) ∨ G((!s ∨ XG!t)))), the
latter being φ40 “1 cause-2 effect precedence chain” of Spec Patterns


Formula: ltl2dstar i Rabinizer 3 i 

DRA states|pairs|DRA st.|DGRA st.[tDGRA st.|pairs 
FGa V GFb 4 2 4 4 1 2 
(FGa v GFb) ^ (FGc v GFd) 11324| 8 21 16 1l 4 
A? (GFa; — GFb;) 1304706| 10 511 64 1 8 
N (GFa; — GFaj+1) 153558 8 58 17 J 8 
pi 40 4 4 4 3 1 
pa 314 7 21 21 16 4 


However, when the automata are used for probabilistic model checking,
transition-based acceptance does not improve the results so much. Indeed, although 
state-based DGRA are larger than their transition-based counterpart tDGRA, 
the respective product is not much larger (often not at all), see Table 7. For 
instance, consider the case when the only extra information that DGRA carries in 
states, compared to tDGRA, is the labelling of the last transition taken. Then this 
information is absorbed in the product, as the system’s states carry their labelling 
anyway. Therefore, in this relatively common case for simpler formulae (like the 


one in Table 7), there is no difference in sizes of products with DGRA and tDGRA. 


Further, notice that the DGRA in Table 7 is larger than the DRA obtained
by degeneralization of tDGRA and subsequent transformation to a state-based
Table 7: Model checking the Pnueli-Zuck mutex protocol with 5 processes (altogether m = 308 800
states) from the benchmark set [KNP11] for the property that either all processes 1-4 enter the
critical section infinitely often, or process 5 asks to enter it only finitely often


                                                 | L* DRA     | R3 DRA    | R3 DGRA | R3 tDGRA
Automaton size (and nr. of pairs)                | 196 (5)    | 11 (2)    | 33 (2)  | 1 (2)
Product size                                     | 13 826 588 | 1 100 608 | 308 800 | 308 800
“Effective” size of automaton = Product size / m | 44.78      | 3.56      | 1       | 1


automaton. However, the product with the DGRA is of the size of the original 
system, while for DRA it is larger! This demonstrates the superiority of generalized 
Rabin automata over standard Rabin automata with respect to the product size 
and thus also computation time, which is superlinear in the size. 

Finally, Table 8 compares the running times for the discussed fairness-fragment. 


Table 8: Running times for constructing an automaton and its acceptance condition for fairness
constraints ⋀_{i=1}^{k} (FGa_i ∨ GFb_i) for different k. Times are given in seconds with time-out
(blank space) after one hour. The experiments were run on a 2.8 GHz Intel Core i7 with
8 GB memory. Here we also compare to Rabinizer 3 of [KK14], denoted by R3.0, where all
transitions are handled separately, as opposed to a symbolic encoding into edges of Rabinizer
3.1, denoted by R3.1


k | L* R1 L3 R3.0 R3.1-unopt. R3.1-opt. 
1 |0.15 0.10 0.01 0.04 0.12 0.12 
2 | 4.3 0.19 0.01 0.08 0.29 0.14 
3 5.7 0.03 0.38 2.1 0.24 
4 0.19 3.8 22 0.54 
5 1.9 105 640 1.2 
6 25 4.1 
6 350 17 
8 86 
9 670 
10 


10 Formalization in Isabelle 


We have mechanically verified the proof of correctness of our construction using the
Isabelle theorem prover¹¹, which provides a rich library of formalised mathematics
and convenient support for proof development. A detailed introduction can be
found in [NPW02]. Similar work was pioneered by the CAVA project¹², which
already verified a range of automata-theoretic algorithms [ELN+13]. In fact some
of the theories developed in the context of the CAVA project are also reused in our
work. The formalization was carried out by one of us, and constituted his Master's
thesis. The formal proof can be found at [Sic15], and consists of around 11000 lines.


11 https://isabelle.in.tum.de/
12 https://cava.in.tum.de/

10.1 Relation between formalisation and the content of this paper

The formalization is split into several “theories”. A theory is just a collection of
definitions and results, which can reuse results from other theories. Our theories
are listed in Table 9.


Table 9: Important theories and their content. 


LTL.thy                         Syntax and semantics of LTL.
af.thy                          The af and af_G functions and their properties.
Logical_Characterization.thy    The logical characterization theorems.
Mojmir.thy                      Mojmir automata.
Rabin.thy                       (Generalised) Rabin automata.
Mojmir_Rabin.thy                Translation from Mojmir to Rabin automata.
LTL_Rabin.thy                   Translation from LTL to DGRA.
LTL_Rabin_Unfold_Opt.thy        Unfold optimisation of the general translation.


For the main definitions, lemmas, and theorems of this paper, Table 10 shows 
their corresponding name and location in the formalized theories. With the help of 
this table, interested readers can establish the correspondence between our results 
and their formal versions. For example, we reproduce here Theorem 7 next to the 
formal version in the mechanized proof: 


Theorem 13 (Logical characterization theorem III) For every LTL formula
FGφ and every word w: w ⊨ FGφ iff there exists a closed set G ⊆ G(FGφ) for w
containing Gφ.


theorem ltl_FG_logical_characterization:
  "w ⊨ FGφ ⟷ (∃𝒢 ⊆ 𝐆(FGφ). Gφ ∈ 𝒢 ∧ closed 𝒢 w)"
  (is "?lhs ⟷ ?rhs")
proof
  assume ?lhs
  hence "Gφ ∈ 𝒢_FG(FGφ) w" and "𝒢_FG(FGφ) w ⊆ 𝐆(FGφ)"
    unfolding 𝒢_FG_alt_def by auto
  thus ?rhs
    using closed_𝒢_FG by metis
qed (blast intro: closed_FG)


Note that there are several differences between the formulation of the theorem 
in the paper and in the formalized theories. 


— Unbound variables such as w and φ are implicitly universally quantified.

— The type system automatically deduces the types of w, which is an ω-word, and
φ, which is an LTL formula, using the signature of the operator ⊨. Thus the
type annotations are omitted.

— Since we cannot use the whole range of mathematical symbols and notation
due to technical constraints, alternative notation is used. In this instance G is
replaced by 𝐆, and G_w(φ) by 𝒢_FG φ w.

The theorem declaration is then followed by the proof body, which is written in
the proof language Isar. In every proof step facts are established using the keywords
have, hence, show, and thus. These claims then have to be proven using a proof
method, such as blast, metis, and auto. Furthermore, we can pass additional facts
to these methods using parameters such as intro, dest or via the using keyword.
All remaining proof goals, in this case that the right hand side implies the left,
are proven with the method behind qed. A detailed explanation of the language is
given in [Wen07], while the whole specification can be found in [Wen14].

Note that some definitions and claims, like for instance Proposition 1 and 
Theorem 3, have no counterpart in the formalisation, as they only illustrate 
different aspects of the construction, but are not an essential part of it. In the 
first case, we directly define LTL in negation normal form and do not include a 
translation method, while in the second case the theorem is just a special case of 
Theorem 7 and thus left out. 


Table 10: Location of definitions, lemmas and theorems. 


Def. 2 LTL.thy ltl_semantics

Def. 4 LTL.thy ltl_prop_entailment

Def. 6 af .thy af_letter, af 

Lem. 1 af .thy af_nested_propos, af_simps, 
af_respectfulness 

Prop. 2 af .thy af ltl continuation 

'Thm. 1 Logical Characterization.thy | 1tl implies provable 

Lem. 2 Mojmir.thy rank None. Suc, rank monotonic 

Lem. 3 Mojmir.thy state rank step 

Lem. 4 Mojmir.thy token succeeds run merge, 
token squats run merge 

Lem. 5 Mojmir.thy mojmir. accept iff token set accept 

Lem. 5 Mojmir.thy Stable rank bounded 

'Thm. 6 Mojmir Rabin.thy mojmir accept iff rabin accept 

Def. 17 af.thy af, G letter, afc 

Lem. 7 Logical Characterization.thy | closed Crc, closed FG 

'Thm. 7 Logical Characterization.thy | ltl FG logical characterization 

Lem. 8 af.thy afg sat core 

'Thm. 9 LTL Rabin.thy ltl FG to generalised rabin, correct 


Lem. 10 | Logical Characterization.thy ^ almost all suffixes model F 
Thm. 10 Logical Characterization.thy ^ tl logical characterization 


Thm. 11  LTL Rabin.thy F_eq_S 
Lem. 11 Mojmir.thy token_accepting_rank 
Thm. 12 LTL Rabin.thy ltl to generalised, rabin correct 


10.2 Merits of the Mechanization 


While the effort invested in the mechanization of the proof has been very considerable
(about 8 person-months of a master's student who had taken an introductory
course on Isabelle), it has helped to identify several bugs in the construction we
presented in [EK14], the conference paper preceding this one. All but one concerned
corner cases that were arguably not very relevant. For example, the translation
from a Mojmir to a Rabin automaton was incorrect for the case in which the
Mojmir automaton has one single state, which is at the same time an accepting
state. However, one bug was more serious. Lemma C of our conference paper was
wrong, due to a mistake in the proof. The proof was carried out by induction
over the structure of LTL formulae. Since our attempts at mechanizing the proof
obviously failed, we repeatedly tried to correct the argument by nesting induction
proofs. This process eventually led to the smallest formula known to us for which
the lemma fails: G(Xa ∨ GXb). Observe that the formula is already long enough
to have a good chance of surviving random testing. Moreover, testing can only
be performed with respect to another tool producing DRAs from formulae, which
could itself have a bug, and the test requires checking the equivalence of deterministic
Rabin automata, which is a complicated task. Finally, we do not know of any
reasonable way of certifying an LTL to DRA translation, that is, of making the
tool produce a certificate of correctness that can be checked by independent means.

After these experiences, we consider automata-theoretic constructions used 
in model checking tools to be an area in which mechanized proofs are highly 
desirable, if not necessary. Many of the constructions are very clever and involved. 
Moreover, while they often rely on relatively simple intuitions, their correctness 
proofs often involve detailed case analyses. Since the constructions become part of 
model checkers, which for the most part are used to find bugs in other systems, bugs 
in the construction itself can have a multiplying effect. Finally, as mentioned above, 
there is no simple direct way to test the tools. However, we can take the following 
indirect approach. Firstly, given a formula φ, we construct the corresponding
automaton A. Secondly, we model check the transition system of A against the
formula φ ↔ ρ, where the LTL formula ρ encodes the acceptance condition of A.
The translation is correct if and only if the model checking procedure does not find
any violation. This approach relies on a verified model checker (for non-probabilistic 
systems). 


11 Conclusions 


We have presented the first direct translation from LTL formulae to deterministic
Rabin automata able to handle arbitrary formulae. The construction is compositional.
Given φ, we compute (1) a transition system for φ, automata for each
G-subformula of φ, and their parallel composition, and (2) the acceptance condition:
we first guess a set of G-subformulae that are true (this yields the accepting
states of automata for G-subformulae), and then guess the ranks (this yields the
information for a co-Büchi acceptance condition of the whole product).

The compositional approach together with the logical structure of states opens
the door to many possible optimizations. Since the automata for G-subformulae 
are typically very small, we can aggressively try to optimize them, knowing that 
each reduced state in one potentially leads to large savings in the final number of 
states of the product. So far we have only implemented a few simple optimizations, 
and we think there is still much room for improvement. 

We have provided a mechanized proof of the construction, which has also led 
to discovery of a serious bug in the original construction [EK14]. 

We have conducted a detailed experimental comparison. Our construction 
outperforms two-step approaches that first translate the formula into a Büchi 
automaton and then apply Safra's construction. Finally, we produce a (often much 
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smaller) generalized Rabin automaton, which can be directly used for probabilistic 
verification, without further translation into a standard Rabin automaton. 
Lemma 1. For every formula $\varphi$ and every finite word $w \in (2^{Ap})^*$:

(1) $\mathit{af}(\varphi, w)$ is a boolean combination of proper subformulae of $\varphi$.
(2) If $\mathit{af}(\varphi, w) = \mathbf{tt}$, then $\mathit{af}(\varphi, ww') = \mathbf{tt}$ for every $w' \in (2^{Ap})^*$, and analogously for $\mathbf{ff}$.
(3) If $\varphi_1 \equiv_P \varphi_2$, then $\mathit{af}(\varphi_1, w) \equiv_P \mathit{af}(\varphi_2, w)$.
(4) If $\varphi$ has $n$ proper subformulae, then the set of formulae reachable from $\varphi$ has at most $2^{2^n}$
equivalence classes of formulae with respect to propositional equivalence.


Proof (1) By structural induction on $\varphi$.

(2) Follows immediately from $\mathit{af}(\mathbf{tt}, \nu) = \mathbf{tt}$ and $\mathit{af}(\mathbf{ff}, \nu) = \mathbf{ff}$.

(3) By (1) every formula $\varphi$ is a positive boolean combination of proper formulae. Since
$\mathit{af}$ distributes over $\wedge$ and $\vee$, the formula $\mathit{af}(\varphi, \nu)$ is obtained by applying a simultaneous
substitution to the proper formulae. (For example, a proper formula $\mathbf{G}\psi$ is substituted by
$\mathit{af}(\psi, \nu) \wedge \mathbf{G}\psi$.) Let $\varphi[S]$ be the result of the substitution.
Consider two equivalent formulae $\varphi_1 \equiv_P \varphi_2$. Since we apply the same substitution to
both sides, the substitution lemma of propositional logic guarantees $\varphi_1[S] \equiv_P \varphi_2[S]$. So
$\mathit{af}(\varphi_1, \nu) \equiv_P \mathit{af}(\varphi_2, \nu)$ for a letter $\nu$. The general case
$\mathit{af}(\varphi_1, w) \equiv_P \mathit{af}(\varphi_2, w)$ follows by induction on the length of $w$.

(4) Follows from (1) and the fact that there are $2^{2^n}$ equivalence classes of boolean formulae
with $n$ variables.
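
Items (3) and (4) can be made concrete with a small Python sketch that explores the formulae reachable under $\mathit{af}$ and keeps one representative per propositional equivalence class. It is an illustration under stated assumptions, not the implementation evaluated in the paper: formulae are encoded as nested tuples (a hypothetical encoding), only tt, ff, literals, conjunction, disjunction, X, F and G are handled (U is omitted), the F and G clauses are the unfoldings $\mathit{af}(\mathbf{F}\psi, \nu) = \mathit{af}(\psi, \nu) \vee \mathbf{F}\psi$ and $\mathit{af}(\mathbf{G}\psi, \nu) = \mathit{af}(\psi, \nu) \wedge \mathbf{G}\psi$, and equivalence is decided by a truth table in which the non-boolean subformulae play the role of propositional variables.

```python
from itertools import combinations, product

TT, FF = ("tt",), ("ff",)

def af_letter(phi, nu):
    """One step of af: the formula that has to hold after reading letter nu."""
    op = phi[0]
    if op in ("tt", "ff"):
        return phi
    if op == "ap":                      # atomic proposition
        return TT if phi[1] in nu else FF
    if op == "nap":                     # negated atomic proposition
        return FF if phi[1] in nu else TT
    if op in ("and", "or"):             # af distributes over and / or
        return (op, af_letter(phi[1], nu), af_letter(phi[2], nu))
    if op == "X":
        return phi[1]
    if op == "F":
        return ("or", af_letter(phi[1], nu), phi)
    if op == "G":
        return ("and", af_letter(phi[1], nu), phi)
    raise ValueError(f"unsupported operator: {op}")

def proper_subformulae(phi):
    """Subformulae that are neither constants, conjunctions nor disjunctions."""
    op = phi[0]
    if op in ("tt", "ff"):
        return set()
    if op in ("and", "or"):
        return proper_subformulae(phi[1]) | proper_subformulae(phi[2])
    below = proper_subformulae(phi[1]) if op in ("X", "F", "G") else set()
    return {phi} | below

def evaluate(phi, true_atoms):
    """Propositional evaluation with proper formulae treated as variables."""
    op = phi[0]
    if op == "tt":
        return True
    if op == "ff":
        return False
    if op == "and":
        return evaluate(phi[1], true_atoms) and evaluate(phi[2], true_atoms)
    if op == "or":
        return evaluate(phi[1], true_atoms) or evaluate(phi[2], true_atoms)
    return phi in true_atoms

def signature(phi, variables):
    """Truth table of phi; equal tables mean propositional equivalence."""
    return tuple(
        evaluate(phi, {v for v, bit in zip(variables, bits) if bit})
        for bits in product((False, True), repeat=len(variables))
    )

def reachable_classes(phi, propositions):
    """One representative per equivalence class reachable from phi under af."""
    variables = list(proper_subformulae(phi))
    letters = [frozenset(c) for r in range(len(propositions) + 1)
               for c in combinations(propositions, r)]
    classes = {signature(phi, variables): phi}
    frontier = [phi]
    while frontier:
        current = frontier.pop()
        for nu in letters:
            successor = af_letter(current, nu)
            key = signature(successor, variables)
            if key not in classes:
                classes[key] = successor
                frontier.append(successor)
    return list(classes.values())

phi = ("G", ("or", ("ap", "a"), ("F", ("ap", "b"))))   # G(a or F b)
print(len(reachable_classes(phi, ["a", "b"])), "classes; bound:",
      2 ** (2 ** len(proper_subformulae(phi))))
```

For $\mathbf{G}(a \vee \mathbf{F}b)$ the exploration stops after two classes, represented by the formula itself and by $\mathbf{F}b \wedge \mathbf{G}(a \vee \mathbf{F}b)$, which is far below the double exponential bound of item (4). Exploring from a single representative per class is justified by item (3).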


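Proposition 2. Let $\varphi$ be a formula, and let $ww' \in (2^{Ap})^\omega$ be an arbitrary word. Then $ww' \models \varphi$
iff $w' \models \mathit{af}(\varphi, w)$.


Proof First we prove the property when $w$ is a single letter $\nu$:

    $\nu w' \models \varphi$ iff $w' \models \mathit{af}(\varphi, \nu)$        (2)

We prove (2) by structural induction on $\varphi$. We only consider two representative cases.

- $\varphi = a$. Then
    $\nu w' \models a$, hence $a \in \nu$ (semantics of LTL), hence $\mathit{af}(a, \nu) = \mathbf{tt}$ (def. of $\mathit{af}$), hence $w' \models \mathit{af}(a, \nu)$;
    $\nu w' \not\models a$, hence $a \notin \nu$ (semantics of LTL), hence $\mathit{af}(a, \nu) = \mathbf{ff}$ (def. of $\mathit{af}$), hence $w' \not\models \mathit{af}(a, \nu)$.

- $\varphi = \mathbf{F}\varphi'$. Then
         $\nu w' \models \mathbf{F}\varphi'$
    iff  $\nu w' \models (\mathbf{X}\mathbf{F}\varphi') \vee \varphi'$                            ($\mathbf{F}\varphi' \equiv \mathbf{X}\mathbf{F}\varphi' \vee \varphi'$)
    iff  $(w' \models \mathbf{F}\varphi') \vee (\nu w' \models \varphi')$                  (semantics of LTL)
    iff  $(w' \models \mathbf{F}\varphi') \vee (w' \models \mathit{af}(\varphi', \nu))$    (ind. hyp.)
    iff  $w' \models \mathbf{F}\varphi' \vee \mathit{af}(\varphi', \nu)$                   (def. of $\mathit{af}$)
    iff  $w' \models \mathit{af}(\mathbf{F}\varphi', \nu)$                                 (def. of $\mathit{af}$)

Now we prove the property for every word $w$ by induction on the length of $w$. If $w = \epsilon$
then $\mathit{af}(\varphi, w) = \varphi$, and so $ww' \models \varphi$ iff $w' \models \varphi$ iff $w' \models \mathit{af}(\varphi, w)$. If $w = \nu w''$ for some $\nu \in 2^{Ap}$,
then we have

         $w' \models \mathit{af}(\varphi, w)$
    iff  $w' \models \mathit{af}(\varphi, \nu w'')$
    iff  $w' \models \mathit{af}(\mathit{af}(\varphi, \nu), w'')$    (def. of $\mathit{af}$)
    iff  $w'' w' \models \mathit{af}(\varphi, \nu)$                  (ind. hyp.)
    iff  $\nu w'' w' \models \varphi$                                (2)
    iff  $ww' \models \varphi$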
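
The statement of Proposition 2 can be checked mechanically on ultimately periodic words. The following Python sketch is only a sanity check under stated assumptions: formulae are nested tuples (a hypothetical encoding), the until operator is left out, and an infinite word is given as a finite prefix followed by a non-empty loop that repeats forever. The helper names `af_letter`, `af` and `holds` are illustrative and do not come from the paper.

```python
TT, FF = ("tt",), ("ff",)

def af_letter(phi, nu):
    """af for a single letter nu (a set of atomic propositions)."""
    op = phi[0]
    if op in ("tt", "ff"):
        return phi
    if op == "ap":
        return TT if phi[1] in nu else FF
    if op == "nap":
        return FF if phi[1] in nu else TT
    if op in ("and", "or"):
        return (op, af_letter(phi[1], nu), af_letter(phi[2], nu))
    if op == "X":
        return phi[1]
    if op == "F":
        return ("or", af_letter(phi[1], nu), phi)
    if op == "G":
        return ("and", af_letter(phi[1], nu), phi)
    raise ValueError(f"unsupported operator: {op}")

def af(phi, word):
    """af(phi, nu w'') = af(af(phi, nu), w''); the empty word leaves phi unchanged."""
    for nu in word:
        phi = af_letter(phi, nu)
    return phi

def holds(phi, prefix, loop, i=0):
    """Does the suffix starting at position i of prefix.loop^omega satisfy phi?"""
    letters = prefix + loop
    n = len(letters)
    op = phi[0]
    if op == "tt":
        return True
    if op == "ff":
        return False
    if op == "ap":
        return phi[1] in letters[i]
    if op == "nap":
        return phi[1] not in letters[i]
    if op == "and":
        return holds(phi[1], prefix, loop, i) and holds(phi[2], prefix, loop, i)
    if op == "or":
        return holds(phi[1], prefix, loop, i) or holds(phi[2], prefix, loop, i)
    if op == "X":
        successor = i + 1 if i + 1 < n else len(prefix)   # wrap around into the loop
        return holds(phi[1], prefix, loop, successor)
    # Positions visited from i onwards: i..n-1 if i lies in the prefix,
    # and the whole loop if i already lies inside the loop.
    future = range(min(i, len(prefix)), n)
    if op == "F":
        return any(holds(phi[1], prefix, loop, j) for j in future)
    if op == "G":
        return all(holds(phi[1], prefix, loop, j) for j in future)
    raise ValueError(f"unsupported operator: {op}")

phi = ("G", ("or", ("ap", "a"), ("F", ("ap", "b"))))   # G(a or F b)
w = [{"a"}, set()]                                      # finite word w
u, v = [], [{"b"}, set()]                               # w' = ({b} {})^omega
# Proposition 2: w w' satisfies phi iff w' satisfies af(phi, w).
print(holds(phi, w + u, v) == holds(af(phi, w), u, v))
```

Since $ww'$ is again ultimately periodic (its prefix is $w$ followed by the prefix of $w'$), both sides of the equivalence are evaluated by the same function.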


Lemma 5. Let $i$ be the rank of condition (2) in Theorem 5. If the rank of $\tau$ stabilizes, then
$\mathit{strk}_w(\tau) < i$.


Proof We first prove the following two claims, where $i$ is the rank of condition (2):

(a) If $\tau$ succeeds at rank $i$, then $\mathit{strk}_w(\tau) < i$.
Since $\tau$ has rank $i$ when it reaches the accepting states, we clearly have $\mathit{strk}_w(\tau) \leq i$.
We show $\mathit{strk}_w(\tau) < i$. Assume the contrary. With the previous observation, we have
$\mathit{strk}_w(\tau) = i$. Let $t$ be some time at which $\tau$ has already entered the accepting states, and
its rank has stabilized. By (2.1), some token $\tau'$ born after time $t$ (i.e., $\tau' > t$) also succeeds
at rank $i$. Let $t' > t$ be the time immediately before $\tau'$ enters the accepting states. Then we
have $\mathit{rk}_w(\tau, t') = i$, because at time $t'$ token $\tau$ has already stabilized, and $\mathit{rk}_w(\tau', t') = i$
by definition. But at time $t'$ token $\tau$ is in some accepting state, while $\tau'$ is not. So we have
two tokens in different states with the same rank, contradicting the definition of rank.

(b) If $\mathit{rk}_w(\tau, t) \leq \mathit{rk}_w(\tau', t) = \mathit{strk}_w(\tau') \in \mathbb{N}$, then $\mathit{rk}_w(\tau, t) = \mathit{strk}_w(\tau)$.
(If a token has reached its stable rank at some time $t$, then so have all tokens of older rank.)
Assume $\mathit{rk}_w(\tau, t) \neq \mathit{strk}_w(\tau)$. Then at some time $t' > t$ the rank of $\tau$ either becomes $\bot$
(because $\tau$ reaches a sink) or improves (because $\tau$'s firm merges with a firm of older rank).
In both cases, the rank of $\tau'$ also improves (because the rank of $\tau$ becomes vacant),
contradicting the assumption that at time $t$ token $\tau'$ has already reached its stable rank.


Assume now that the rank of $\tau$ stabilizes but $\mathit{strk}_w(\tau) \geq i$. By (2.1), some token $\tau'$ born
after the rank of $\tau$ stabilizes succeeds at rank $i$. Since $q_0 \notin F$, this token eventually enters the
accepting states. Let $t$ be the time immediately before $\tau'$ enters the accepting states. We have
$\mathit{rk}_w(\tau', t) = i$. Since $\mathit{strk}_w(\tau) \geq i$, we have $\mathit{rk}_w(\tau, t) \geq i = \mathit{rk}_w(\tau', t)$. By (b) (with the roles of
$\tau$ and $\tau'$ reversed), we get $\mathit{rk}_w(\tau', t) = \mathit{strk}_w(\tau')$, and so $\mathit{strk}_w(\tau') = i$. But, since $\tau'$ succeeds
at rank $i$, this contradicts (a).


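Proposition 5. Let $\mathcal{M}_1 = (Q_1, \Sigma, q_{01}, \delta_1, F_1)$ and $\mathcal{M}_2 = (Q_2, \Sigma, q_{02}, \delta_2, F_2)$. Let $Q =
Q_1 \times Q_2$, let $q_0 = (q_{01}, q_{02})$, and let $\delta \colon Q \times \Sigma \to Q$ be the function given by $\delta((q_1, q_2), \nu) =
(\delta_1(q_1, \nu), \delta_2(q_2, \nu))$. Then the tuples

    $\mathcal{M}_1 \cap \mathcal{M}_2 = (Q, \Sigma, q_0, \delta, F_1 \times F_2)$
    $\mathcal{M}_1 \cup \mathcal{M}_2 = (Q, \Sigma, q_0, \delta, (F_1 \times Q_2) \cup (Q_1 \times F_2))$

are also Mojmir automata, and moreover $L(\mathcal{M}_1 \cap \mathcal{M}_2) = L(\mathcal{M}_1) \cap L(\mathcal{M}_2)$ and $L(\mathcal{M}_1 \cup \mathcal{M}_2) =
L(\mathcal{M}_1) \cup L(\mathcal{M}_2)$.


Proof We have to show that states reachable from an accepting state of $\mathcal{M}_1 \cap \mathcal{M}_2$ or $\mathcal{M}_1 \cup \mathcal{M}_2$
are again accepting. If $(q_1, q_2)$ is an accepting state of $\mathcal{M}_1 \cap \mathcal{M}_2$ or $\mathcal{M}_1 \cup \mathcal{M}_2$, then by definition

    $\delta((q_1, q_2), \nu) = (\delta_1(q_1, \nu), \delta_2(q_2, \nu))$.

- If $(q_1, q_2) \in F_1 \times F_2$, then, since $\mathcal{M}_1$ and $\mathcal{M}_2$ are Mojmir automata, we have $\delta_1(q_1, \nu) \in F_1$
  and $\delta_2(q_2, \nu) \in F_2$, and so $\delta((q_1, q_2), \nu) \in F_1 \times F_2$.

- If $(q_1, q_2) \in (F_1 \times Q_2) \cup (Q_1 \times F_2)$, then, since $\mathcal{M}_1$ and $\mathcal{M}_2$ are Mojmir automata, we have
  $\delta_1(q_1, \nu) \in F_1$ or $\delta_2(q_2, \nu) \in F_2$, and so $\delta((q_1, q_2), \nu) \in (F_1 \times Q_2) \cup (Q_1 \times F_2)$.

We now prove $L(\mathcal{M}_1 \cap \mathcal{M}_2) = L(\mathcal{M}_1) \cap L(\mathcal{M}_2)$ and $L(\mathcal{M}_1 \cup \mathcal{M}_2) = L(\mathcal{M}_1) \cup L(\mathcal{M}_2)$. Since
$\mathcal{M}_1 \cap \mathcal{M}_2$ and $\mathcal{M}_1 \cup \mathcal{M}_2$ only differ in their accepting states, they have the same function
$\mathit{run}_w(\tau, t)$ describing the position of token $\tau$ at time $t$. Moreover, by the definition of $q_0$ and $\delta$
we easily get

    $\mathit{run}_w(\tau, t) = (\mathit{run1}_w(\tau, t), \mathit{run2}_w(\tau, t))$

where $\mathit{run1}$ and $\mathit{run2}$ are the corresponding functions for $\mathcal{M}_1$ and $\mathcal{M}_2$. So we have

(a) Token $\tau$ of $\mathcal{M}_1 \cap \mathcal{M}_2$ eventually reaches $F_1 \times F_2$ iff the token $\tau$ of $\mathcal{M}_1$ eventually reaches
    $F_1$ and the token $\tau$ of $\mathcal{M}_2$ eventually reaches $F_2$.

(b) Token $\tau$ of $\mathcal{M}_1 \cup \mathcal{M}_2$ eventually reaches $(F_1 \times Q_2) \cup (Q_1 \times F_2)$ iff the token $\tau$ of $\mathcal{M}_1$
    eventually reaches $F_1$, or the token $\tau$ of $\mathcal{M}_2$ eventually reaches $F_2$.

By (a), almost every token of $\mathcal{M}_1 \cap \mathcal{M}_2$ eventually reaches $F_1 \times F_2$ iff almost every token of $\mathcal{M}_1$
eventually reaches $F_1$, and almost every token of $\mathcal{M}_2$ eventually reaches $F_2$. So $L(\mathcal{M}_1 \cap \mathcal{M}_2) =
L(\mathcal{M}_1) \cap L(\mathcal{M}_2)$. By (b), almost every token of $\mathcal{M}_1 \cup \mathcal{M}_2$ eventually reaches $(F_1 \times Q_2) \cup (Q_1 \times F_2)$
iff almost every token of $\mathcal{M}_1$ eventually reaches $F_1$, or almost every token of $\mathcal{M}_2$ eventually
reaches $F_2$. So $L(\mathcal{M}_1 \cup \mathcal{M}_2) = L(\mathcal{M}_1) \cup L(\mathcal{M}_2)$.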


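Lemma 7. Let $\varphi$ be a formula and let $w$ be a word.

(a) Every set $\mathcal{G} \subseteq \mathbb{G}(\varphi)$ closed for $w$ is included in $\mathcal{G}_w(\varphi)$.
(b) $\mathcal{G}_w(\varphi)$ is closed for $w$.


Proof (a): Given $\mathcal{G} \subseteq \mathbb{G}(\varphi)$, we inductively assign to every $\mathbf{G}\psi \in \mathcal{G}$ an index as follows. If $\psi$
has no G-subformulae, then $\mathbf{G}\psi$ has index 0; if $\psi$ has G-subformulae, then the index of $\mathbf{G}\psi$ is the
maximum of the indices of the G-subformulae of $\psi$ plus 1.

Assume $\mathcal{G} \subseteq \mathbb{G}(\varphi)$ is closed for $w$, and let $\mathbf{G}\psi \in \mathcal{G}$. We prove $w \models \mathbf{F}\mathbf{G}\psi$ by induction on
the index $n$ of $\mathbf{G}\psi$.

- $n = 0$. Since $\mathcal{G}$ is closed for $w$, we have $\mathcal{G} \models_P \mathit{af}_{\mathbb{G}}(\psi, w_{ij})$ for almost every $i \in \mathbb{N}$ and
  almost every $j > i$. Let $j > i$ be such that $\mathcal{G} \models_P \mathit{af}_{\mathbb{G}}(\psi, w_{ij})$ holds. Since $\psi$ has no
  G-subformulae (because $n = 0$), the formulae of $\mathcal{G}$ occur neither in $\psi$ nor, by the definition
  of $\mathit{af}_{\mathbb{G}}$, in $\mathit{af}_{\mathbb{G}}(\psi, w_{ij})$. So we get $\emptyset \models_P \mathit{af}_{\mathbb{G}}(\psi, w_{ij})$, which implies $\mathit{af}_{\mathbb{G}}(\psi, w_{ij}) \equiv_P \mathbf{tt}$.
  Moreover, since $\psi$ has no G-subformulae and $\mathit{af}_{\mathbb{G}}$ and $\mathit{af}$ only differ on G-formulae, we have
  $\mathit{af}_{\mathbb{G}}(\psi, w_{ij}) = \mathit{af}(\psi, w_{ij})$. So we finally obtain $\mathit{af}(\psi, w_{ij}) \equiv_P \mathbf{tt}$ for almost every $i \in \mathbb{N}$
  and almost every $j > i$. Apply now Theorem 3.

- $n > 0$. Let $\mathcal{G}'$ be the set of formulae of $\mathcal{G}$ that are subformulae of $\psi$. For every $\mathbf{G}\psi' \in \mathcal{G}'$
  the index of $\mathbf{G}\psi'$ is at most $n - 1$ and so, by induction hypothesis, we have $w \models \mathbf{F}\mathbf{G}\psi'$. So
  there exists $k_1$ such that $w_i \models \mathcal{G}'$ for every $i > k_1$.

  Moreover, since $\mathcal{G}$ is closed for $w$, we have $\mathcal{G} \models_P \mathit{af}_{\mathbb{G}}(\psi, w_{ij})$ for almost every $i \in \mathbb{N}$ and
  almost every $j > i$. Further, since the formulae of $\mathcal{G} \setminus \mathcal{G}'$ do not appear in any $\mathit{af}_{\mathbb{G}}(\psi, w_{ij})$,
  there exists $k_2$ such that $\mathcal{G}' \models_P \mathit{af}_{\mathbb{G}}(\psi, w_{ij})$ for every $i > k_2$ and almost every $j > i$.
  Taking $k = \max\{k_1, k_2\}$, we obtain:

  (i) $w_i \models \mathcal{G}'$ for every $i > k$, and

  (ii) $\mathcal{G}' \models_P \mathit{af}_{\mathbb{G}}(\psi, w_{ij})$ for every $i > k$ and almost every $j > i$.

  We show that (i) and (ii) imply $w_i \models \psi$ for almost every $i > k$. We proceed by a structural
  induction on $\psi$, very similar to the one in the proof of Proposition 2, except for the case
  $\psi = \mathbf{G}\psi'$. We omit some cases, and only sketch the proof of others.

  - $\psi = a$. Let $i > k$ such that (i) holds. By (ii) we have $\mathcal{G}' \models_P \mathit{af}_{\mathbb{G}}(a, w_{ij})$ for almost
    every $j > i$, and so $\mathit{af}_{\mathbb{G}}(a, w_{ij}) = \mathbf{tt}$ for almost every $j > i$. But $\mathit{af}_{\mathbb{G}}(a, w_{ij}) = \mathbf{tt}$
    implies $w_{i(i+1)} \models a$, and so $w_i \models a$.

  - $\psi = \psi_1 \wedge \psi_2$ and $\psi = \psi_1 \vee \psi_2$. Both cases follow immediately from the induction
    hypothesis.

  - $\psi = \mathbf{G}\psi'$. By the definition of $\mathit{af}_{\mathbb{G}}$, we have $\mathit{af}_{\mathbb{G}}(\psi, w_{ij}) = \mathbf{G}\psi' = \psi$ for every $j > i$.
    So, by (ii), we have $\mathcal{G}' \models_P \psi$ which, together with (i), implies $w_i \models \psi$ for every $i > k$.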


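(b): We first prove a preliminary result: if $w \models \varphi$, then $\mathcal{G}_w(\varphi) \models_P \mathit{af}_{\mathbb{G}}(\varphi, w_{0i})$ for almost
every $i \in \mathbb{N}$. The proof is very similar to that of Theorem 1. It suffices to say that we proceed
by structural induction on $\varphi$, using the same arguments as in Theorem 1, with two minor
adjustments:

- $\mathit{af}_{\mathbb{G}}(\varphi, w_{0i}) \equiv_P \mathbf{tt}$ is replaced by $\mathcal{G}_w(\varphi) \models_P \mathit{af}_{\mathbb{G}}(\varphi, w_{0i})$.
- The G-case, i.e., $\varphi = \mathbf{G}\varphi'$, is proved differently. It follows immediately from the fact that,
  since $w \models \mathbf{G}\varphi'$ by assumption, we have $\mathbf{G}\varphi' \in \mathcal{G}_w(\mathbf{G}\varphi')$.


Now we proceed to prove (b), also by structural induction on $\varphi$. If $\varphi$ is not a G-formula, then
the result follows either directly from the definitions or directly from the induction hypothesis.
So consider the case $\varphi = \mathbf{G}\varphi'$. By definition we have $\mathcal{G}_w(\varphi') \subseteq \mathcal{G}_w(\varphi)$, and by induction
hypothesis $\mathcal{G}_w(\varphi')$ is closed for $w$. If $w \not\models \mathbf{F}\mathbf{G}\varphi'$ then $\mathcal{G}_w(\varphi') = \mathcal{G}_w(\varphi)$, and so $\mathcal{G}_w(\varphi)$ is closed
for $w$. If $w \models \mathbf{F}\mathbf{G}\varphi'$ then $\mathcal{G}_w(\varphi) = \mathcal{G}_w(\varphi') \cup \{\mathbf{G}\varphi'\}$. Since $\mathcal{G}_w(\varphi')$ is closed for $w$, we have
$\mathcal{G}_w(\varphi') \models_P \mathit{af}_{\mathbb{G}}(\psi, w_{ij})$ for almost every $i \in \mathbb{N}$, almost every $j > i$, and for every $\mathbf{G}\psi \in \mathcal{G}_w(\varphi')$.
So it suffices to show $\mathcal{G}_w(\varphi) \models_P \mathit{af}_{\mathbb{G}}(\varphi', w_{ij})$ for almost every $i \in \mathbb{N}$ and almost every $j > i$.
Since $w \models \mathbf{F}\mathbf{G}\varphi'$, we have $w_i \models \varphi'$ for almost all $i \in \mathbb{N}$. Applying the preliminary result above
to every $w_i$, we obtain $\mathcal{G}_w(\varphi) \models_P \mathit{af}_{\mathbb{G}}(\varphi', w_{ij})$ for almost every $i \in \mathbb{N}$ and almost every $j > i$,
and we are done.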


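Lemma 8. Let $\varphi$ be a formula and let $\mathcal{G} \subseteq \mathbb{G}(\varphi)$. For every $\psi \in \mathit{Reach}_{\mathbb{G}}(\varphi)$ and every $\nu \in 2^{Ap}$,
if $\mathcal{G} \models_P \psi$ then $\mathcal{G} \models_P \mathit{af}_{\mathbb{G}}(\psi, \nu)$.


Proof We proceed by induction on the structure of $\psi$. Since $\mathcal{G} \models_P \psi$, by the definition
of propositional implication, the formula $\psi$ must be either $\mathbf{tt}$, a conjunction, a disjunction,
or a G-formula. If $\psi = \mathbf{tt}$ then $\mathit{af}_{\mathbb{G}}(\psi, \nu) = \mathbf{tt}$ and we are done. If $\psi = \psi_1 \wedge \psi_2$ then
$\mathit{af}_{\mathbb{G}}(\psi, \nu) = \mathit{af}_{\mathbb{G}}(\psi_1, \nu) \wedge \mathit{af}_{\mathbb{G}}(\psi_2, \nu)$ and $\mathcal{G} \models_P \mathit{af}_{\mathbb{G}}(\psi, \nu)$ follows immediately from the
induction hypothesis. The case $\psi = \psi_1 \vee \psi_2$ is analogous. Finally, if $\psi = \mathbf{G}\psi'$ for some formula
$\psi'$ then $\mathit{af}_{\mathbb{G}}(\mathbf{G}\psi', \nu) = \mathbf{G}\psi'$, and we are done.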
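
As a small worked instance of Lemma 8, take the hypothetical choice $\varphi = \mathbf{G}a \vee \mathbf{F}b$, $\mathcal{G} = \{\mathbf{G}a\}$, $\psi = \varphi$ and $\nu = \emptyset$. The derivation assumes that $\mathit{af}_{\mathbb{G}}$ agrees with $\mathit{af}$ on all operators except G, as used in the proof of Lemma 7, so that $\mathit{af}_{\mathbb{G}}(\mathbf{F}b, \nu) = \mathit{af}_{\mathbb{G}}(b, \nu) \vee \mathbf{F}b$.

```latex
% Hypothetical instance: \varphi = \mathbf{G}a \vee \mathbf{F}b, \mathcal{G} = \{\mathbf{G}a\}, \nu = \emptyset.
\begin{align*}
\mathit{af}_{\mathbb{G}}(\mathbf{G}a \vee \mathbf{F}b, \emptyset)
  &= \mathit{af}_{\mathbb{G}}(\mathbf{G}a, \emptyset) \vee \mathit{af}_{\mathbb{G}}(\mathbf{F}b, \emptyset) \\
  &= \mathbf{G}a \vee \bigl(\mathit{af}_{\mathbb{G}}(b, \emptyset) \vee \mathbf{F}b\bigr) \\
  &= \mathbf{G}a \vee (\mathbf{ff} \vee \mathbf{F}b)
\end{align*}
% The disjunct \mathbf{G}a belongs to \mathcal{G}, so \mathcal{G} \models_P \mathit{af}_{\mathbb{G}}(\psi, \emptyset).
% With af instead of af_G the first disjunct would become
% \mathit{af}(a, \emptyset) \wedge \mathbf{G}a = \mathbf{ff} \wedge \mathbf{G}a, and the implication would be lost.
```

The contrast in the last comment shows why the lemma is stated for $\mathit{af}_{\mathbb{G}}$: since G-formulae are left untouched, any G-formula of $\mathcal{G}$ that witnesses the implication before the step still witnesses it afterwards.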


Lemma 11. Let $\mathcal{M}(\psi, \mathcal{G})$ be the Mojmir automaton for a formula $\psi$. Assume $\mathcal{M}(\psi, \mathcal{G})$ accepts
a word $w$ at the smallest accepting rank $r$. For almost every $t \in \mathbb{N}$ and for every token $\tau$ of the
run of $\mathcal{M}(\psi, \mathcal{G})$ on $w$, the token succeeds iff

1. $\tau > t$, or
2. $\mathit{sr}_w(\mathit{run}_w(\tau, t), t) \geq r$, or
3. $\mathit{run}_w(\tau, t) \in F$.


Proof Consider the accepting run of $\mathcal{M}(\psi, \mathcal{G})$ on $w$. Let $k'$ be large enough such that at time
$t' > k'$: all tokens $\tau$ born after $k'$ eventually succeed; the finitely many tokens that fail have
already reached a sink; and the finitely many tokens that succeed with rank smaller than $r$ have
already reached an accepting state. Notice that such a $k'$ only exists for the smallest
accepting rank, since infinitely many tokens enter the accepting states with this rank and for
all larger accepting ranks this constant does not exist. Furthermore, let $k > k'$ be large enough
so that all squatting tokens born before or at time $k'$ have already reached their stable rank at
time $k$. We show that the lemma holds for every $t > k$.
Let $\tau$ be an arbitrary token.

- Assume $\tau$ succeeds. We show that if (1) and (3) do not hold, then (2) holds. Since (3) does not hold, $\tau$
  has not yet reached the accepting states. By our choice of $k'$, by the time $\tau$ enters the
  accepting states it will have rank $r$ or larger. Since the rank of a token can only decrease,
  its current rank is also equal to the accepting rank $r$ or larger. So $\mathit{sr}_w(\mathit{run}_w(\tau, t), t) \geq r$.

- Assume (1), (2), or (3) holds. If (3) holds, then $\tau$ succeeds by the definition of success. If (1)
  holds, then $\tau$ succeeds by our choice of $k'$. Assume now that (2) holds. We show that $\tau$
  neither fails nor squats outside the accepting states, and so necessarily succeeds. Since $\tau$
  has a rank at time $t$, it is not in a sink, and so, by our choice of $k'$, the token does not fail.
  To show that $\tau$ does not squat outside the accepting states, we recall part (c) in the proof
  of Theorem 5: the stable rank of a token is bounded from above by the accepting ranks, thus
  also by the smallest. So, by (2), the rank of $\tau$ has not stabilized yet, and therefore, by our
  choice of $k$, it does not squat outside the accepting states.